On the Legal Front: The Supreme Court Ruling in Ricci v. DeStefano
Art Gutman
Florida Institute of Technology
Eric Dunleavy
DCI Consulting
On June 29, 2009, the Supreme Court ruled 5–4 in favor of 18 plaintiffs who challenged the actions of the New Haven Civil Service Board (CSB), which discarded promotion exams for lieutenant and captain firefighter jobs after the exams were administered and scored. There were four opinions. The majority ruling was written by Kennedy (for Alito, Roberts, Scalia, and Thomas) and a dissenting opinion was written by Ginsburg (for Breyer, Souter, and Stevens). In addition, there was a concurrence by Alito (for Scalia and Thomas) addressing Ginsburg’s dissenting opinion and a concurrence by Scalia who, speaking for himself, questioned whether adverse impact rules are legal under the 14th Amendment. We will focus on the majority and dissenting opinions, as well as Scalia’s concurrence.
To save space, we will not dwell here on the facts of the case. We did this in the April 2009 issue.1 The bottom line is that the district court ruled for the CSB and the 2nd Circuit affirmed. However, in a move that is rarely seen in Supreme Court cases, Justice Kennedy not only reversed the lower court rulings, he also granted summary judgment to the plaintiffs. Therefore, this case is over, except that the district court must decide on the remedies for Frank Ricci and his 17 co-plaintiffs.
1 In addition, we encourage readers to go to the SIOP Exchange blog site where there are opinions by SIOP members both before and after the ruling and to a column by our SIOP publicists (Clif Boutelle & Stephany Schings) on solicited opinions by various SIOP Fellows and Members (go to “View Items” on the SIOP.org face page).
The results of the promotion tests are depicted in the tables below. We presented these results in our April 2009 column. Based on a “rule of three,” the top nine scores were eligible for promotion to captain and the top 10 scores were eligible for promotion to lieutenant. The bottom line is that zero Blacks and zero Hispanics were eligible for promotion to lieutenant, and zero Blacks and two Hispanics were eligible for promotion to captain in the first round of promotions.
Captain exam (7 vacancies)
______________________________________________
                 Whites   Blacks   Hispanics
______________________________________________
Applicants         25        8         8
Passing score      16        3         3
Top 9 scores        7        0         2
______________________________________________
Lieutenant exam (8 vacancies)
______________________________________________
                 Whites   Blacks   Hispanics
______________________________________________
Applicants         43       19        15
Passing score      25        6         3
Top 10 scores      10        0         0
______________________________________________
If you Google this case, you will find numerous news reports and blogs sensationalizing the ruling. For example, as noted by Eric on the SIOP Exchange, you will see many opinions relating to the “end to affirmative action, changes to the job-relatedness burden under Title VII, the end to adverse impact as we know it, etc.” These opinions do not square with the ruling itself. First and foremost, this was a disparate treatment case, not an adverse impact case. The key issue was whether the CSB had a legal motive for discarding the exam results and, if not, whether there was a “strong basis in evidence” for having a racial motive to discard the test. Justice Kennedy’s ruling was yes, there was a racial motive, and no, there was no strong basis for having this motive. Sounds strange? Hang in there, we’re just starting.
Ordinarily, disparate treatment defenses are relatively easy for defendants; they simply articulate (without factual proof) a nondiscriminatory reason for what they did. The CSB articulated their fear of losing a potential adverse impact claim to minority applicants, particularly Blacks. Usually, after the articulation, the burden passes to the plaintiffs (Ricci & 17 others) to prove that the articulation is a pretext for discrimination (see McDonnell Douglas v. Green, 1973).2 But this was no ordinary disparate treatment case. Because of the “tension” between disparate treatment and adverse impact, Justice Kennedy placed a heavier burden of proof (i.e., the strong basis in evidence instead of simple articulation) on the CSB and ruled that CSB failed to carry that burden.
2 For example, in McDonnell Douglas v. Green, Percy Green complained he was not rehired by the company because of his race. McDonnell Douglas articulated (without providing any evidence at all) that Green was not rehired because of illegal activities against the company during a prior layoff. It was Green’s burden to prove that the articulation was a pretext for discrimination (which he failed to do), not the company’s burden to prove its articulation was true and nondiscriminatory.
Adverse Impact
Before delving into the disparate treatment ruling, let’s take a step back and imagine that the CSB certified the test and the minority firefighters sued. What might have happened in a hypothetical adverse impact case?
Prima Facie Phase
The first step (or phase) of an adverse impact case is statistical proof of adverse impact itself (i.e., a disproportionate effect of the decision not to certify on minority applicants). Justice Kennedy cited both the 80% rule (on test pass rates) and the zero rate of promotion of Blacks. On the 80% rule, Kennedy stated the following:
The racial adverse impact here was significant, and petitioners do not dispute that the City was faced with a prima facie case of disparate-impact liability. On the captain exam, the pass rate for white candidates was 64 percent but was 37.5 percent for both black and Hispanic candidates. On the lieutenant exam, the pass rate for white candidates was 58.1 percent; for black candidates, 31.6 percent; and for Hispanic candidates, 20 percent. The pass rates of minorities, which were approximately one-half the pass rates for white candidates, fall well below the 80-percent standard set by the EEOC to implement the disparate-impact provision of Title VII. See 29 CFR §1607.4(D) (2008).
On the zero rate of Black promotion, Kennedy also noted that “the City could not have considered Black candidates for any of the then-vacant lieutenant or captain positions.”
We have issues with relying so heavily on the 80% rule, and with applying it only to a nominal passing score (of 70), but we will save that for another column.3 For present purposes, it is reasonable to believe the minority applicants, particularly the Black applicants, were adversely impacted by promotion based on the exam.
3 In a nutshell, we think a nominal passing score (in this case 70) is less relevant; the more important score is the effective passing score (the lowest score that was eligible for promotion). Also, the Uniform Guidelines point to the 80% rule as a rule of thumb, and there is plenty of case law on issues relating to sample sizes and statistically significant differences (see, for example, US v. City of New York, 2009, for an extensive discussion of statistical arguments relating to adverse impact).
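To make the arithmetic behind the 80% rule concrete, here is a minimal Python sketch. It is our illustration, not part of the Court's analysis. The counts are taken from the results tables in this column; the function names, the use of White candidates as the reference group, and the choice of a one-sided Fisher exact test as the significance check are all assumptions made for illustration. Other tests are used in practice, and nothing here reproduces any expert's actual analysis in the case.

```python
from math import comb

# Counts from the results tables in this column, per exam and group:
# (applicants, passed at the nominal cutoff of 70, eligible in the first round)
RESULTS = {
    "captain": {"White": (25, 16, 7), "Black": (8, 3, 0), "Hispanic": (8, 3, 2)},
    "lieutenant": {"White": (43, 25, 10), "Black": (19, 6, 0), "Hispanic": (15, 3, 0)},
}


def impact_ratio(focal_sel, focal_n, ref_sel, ref_n):
    """Focal-group selection rate divided by the reference-group rate."""
    return (focal_sel / focal_n) / (ref_sel / ref_n)


def fisher_lower_tail(focal_sel, focal_n, ref_sel, ref_n):
    """One-sided Fisher exact p-value (illustrative choice of test):
    probability that the focal group receives focal_sel or fewer
    selections if selections were made without regard to group."""
    total_sel = focal_sel + ref_sel
    denom = comb(focal_n + ref_n, total_sel)
    return sum(
        comb(focal_n, x) * comb(ref_n, total_sel - x)
        for x in range(focal_sel + 1)
    ) / denom


def summarize(exam, group, column):
    """column 1 = nominal pass counts, column 2 = first-round eligibility."""
    focal = RESULTS[exam][group]
    ref = RESULTS[exam]["White"]  # reference group is an assumption
    ratio = impact_ratio(focal[column], focal[0], ref[column], ref[0])
    p_value = fisher_lower_tail(focal[column], focal[0], ref[column], ref[0])
    return ratio, ratio < 0.8, p_value


if __name__ == "__main__":
    for exam in RESULTS:
        for group in ("Black", "Hispanic"):
            for column, label in ((1, "nominal pass"), (2, "first-round eligible")):
                ratio, below, p_value = summarize(exam, group, column)
                print(f"{exam:10s} {group:8s} {label:20s} "
                      f"ratio={ratio:.2f} below_80pct={below} p={p_value:.3f}")
```

The two columns can tell different stories. On the captain exam, for instance, the Hispanic-to-White ratio of nominal pass rates is about 0.59 (3/8 versus 16/25), whereas the ratio of first-round eligibility rates is about 0.89 (2/8 versus 7/25).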
Defense Phase
The next step would be for CSB to prove that the tests were job-related and consistent with business necessity (i.e., valid). IOS, the consulting firm that created the exam, used a content-related validity strategy. The precedent within the 2nd Circuit for content validity is from Guardians of New York v. Civil Service Commission (CSC) (1980), in which the following five criteria for content validity were expressed:
1. Suitable job analysis
2. Reasonable competence in test construction
3. Test content related to job content
4. Test content representative of job content
5. Scoring systems selecting applicants who are likely to be better job performers
These criteria have been used by other circuit courts, including the 7th Circuit (Gillespie v. Wisconsin, 1985) and the 6th Circuit (Police Officers v. City of Columbus, 1990), and more recently, were again endorsed by the 2nd Circuit in Gulino v. New York State Education Department (2006).
Left to ourselves (i.e., the I-O profession), there would undoubtedly be disagreement on whether these criteria were satisfied in the IOS exam. For example, in a brief written by five SIOP Fellows,4 it was argued that the job analysis was not suitable (#1) and that there was criterion deficiency (#3) because of a failure to measure “command presence.” Others would disagree.5 This is an important debate, and we should have it—but not here. There is also a gulf between what we as a profession believe and what courts accept. We need to educate the courts with respect to our SIOP Principles (see Landy, 2005)—but not here. The fact is that based on other cases the 2nd Circuit may have supported the content validity of the test, and the Supreme Court would likely have affirmed. In the words of Justice Kennedy:
The City’s assertions that the exams at issue were not job related and consistent with business necessity are blatantly contradicted by the record, which demonstrates the detailed steps taken to develop and administer the tests and the painstaking analyses of the questions asked to assure their relevance to the captain and lieutenant positions. The testimony also shows that complaints that certain examination questions were contradictory or did not specifically apply to firefighting practices in the City were fully addressed, and that the City turned a blind eye to evidence supporting the exams’ validity.
Therefore, the CSB may well have been successful in defending the IOS exam based on content validity.
4 They include Herman Aguinis, Wayne Cascio, Irwin Goldstein, James Outtz, and Sheldon Zedeck.
5 For example, in the column by Boutelle & Schings, Wayne Cascio speaks to the failure of IOS to tap “command presence” and to the merits of assessment centers. Gerald Barrett counters that written tests are better and “command presence” is not that important a KSA.
This is not to say that the Guardians standard for content validity is a soft touch. Indeed, in the Guardians case, the defendants lost based on two of the criteria. Also, in US v. City of New York (July 22, 2009), a case featuring adverse impact on minorities of an entry-level written test for firefighters, Judge Garaufis, of the Eastern District of New York, ruled that the test failed all five Guardians criteria, and, therefore, was not content valid. This was a post-Ricci ruling in which the judge referenced Ricci but ruled that Ricci does not dictate the outcome of the case. Accordingly:
I reference Ricci not because the Supreme Court’s ruling controls the outcome in this case; to the contrary, I mention Ricci precisely to point out that it does not. In Ricci, the City of New Haven had set aside the results of a promotional examination, and the Supreme Court confronted the narrow issue of whether New Haven could defend a violation of Title VII’s disparate treatment provision by asserting that its challenged employment action was an attempt to comply with Title VII’s disparate impact provision.… In contrast, this case presents the entirely separate question of whether Plaintiffs have shown that the City’s use of Exams 7029 and 2043 has actually had a disparate impact upon black and Hispanic applicants for positions as entry-level firefighters. Ricci did not confront that issue.
In fact, Ricci would not control the outcome of Ricci itself as an adverse impact case. Nevertheless, Judge Garaufis speculated that “The Ricci Court concluded that New Haven would not likely have been liable under a disparate impact theory.” Probably, but there are no guarantees, particularly given the pretext phase described below. Judge Garaufis analyzed the New York test in great detail (45 pages). Had he examined the New Haven test in kind, he would have read arguments such as those provided by the SIOP Fellows and might have ruled differently. However, for present purposes, let’s assume the IOS exams would be deemed content valid at the district court level based on Guardians.
Pretext Phase
The last step in an adverse impact case, as noted by Justice Kennedy, is for the plaintiff to prove “there existed an equally valid, less discriminatory alternative that served the City’s needs but that the City refused to adopt.” In fact, this has been accomplished in two recent district court cases, Bradley v. City of Lynn (2006) and Johnson v. City of Memphis (2006).6
6 There is, however, disagreement on the merits of these two cases as expressed by Sharf and Outtz in companion articles written for the October 2007 issue of TIP.
In Bradley, a written test was the only basis for selecting entry-level firefighters. The judge ruled there was adverse impact, and there was insufficient evidence of job relatedness. More importantly for present purposes, the judge also ruled there were two valid alternatives with less adverse impact: (a) a combination of cognitive tests and physical abilities, and (b) a combination of cognitive tests with personality tests and biodata. The judge ruled that “while none of these approaches alone provides the silver bullet, these other non-cognitive tests operate to reduce the disparate impact of the written cognitive examination.”
Johnson is a more compelling case because the judge ruled that a promotion exam (for police sergeant) was reliable and valid, but the plaintiffs won on alternatives with less impact. Critically, the city used a valid promotion exam in 1996 supervised by a DOJ appointed expert. There were four components, including a written test (weighted 20%), performance evaluations (20%), seniority (10%), and a video-based practical test (50%). However, in a subsequent promotion exam administered in 2002, there was no practical test. In addition, integrity tests were cited as a reasonable alternative based on the personnel selection literature. Interestingly, the judge ruled “It is of considerable significance that the City had achieved a successful promotional program in 1996 and yet failed to build upon that success.”
Would something like that have happened in the pretext phase if Ricci were an adverse impact case? We’ll never know. The only evidence of alternatives considered was, in essence, hearsay. The CSB had a phone “consultation” with a competitor of IOS (Dr. Hornick). Among other things, Hornick suggested that assessment centers would produce less adverse impact. However, he never reviewed the IOS test and said other things that, in effect, supported it as a reasonable assessment. Therefore, in the words of Justice Kennedy:
Hornick stated his “belie[f]” that an “assessment center process,” which would have evaluated candidates’ behavior in typical job tasks, “would have demonstrated less adverse impact.”…But Hornick’s brief mention of alternative testing methods, standing alone, does not raise a genuine issue of material fact that assessment centers were available to the City at the time of the examinations and that they would have produced less adverse impact. Other statements to the CSB indicated that the Department could not have used assessment centers for the 2003 examinations.…And although respondents later argued to the CSB that Hornick had pushed the City to reject the test results…the truth is that the essence of Hornick’s remarks supported its certifying the test results.…Hornick stated that adverse impact in standardized testing “has been in existence since the beginning of testing”…and that the disparity in New Haven’s test results was “somewhat higher but generally in the range that we’ve seen professionally.”…He told the CSB he was “not suggesting” that IOS “somehow created a test that had adverse impacts that it should not have had.”…And he suggested that the CSB should “certify the list as it exists.”
Some will read this and opine that Kennedy ruled there were no equally valid alternatives; that’s not quite true. What Kennedy did rule was that the CSB had no strong basis in evidence, at the time the exams were discarded, to believe there were equally valid alternatives with less adverse impact. Instead, there was what Kennedy called a good faith belief based on Dr. Hornick’s phone conversation. For example, if the CSB had the evidence provided by the SIOP Fellows in their amicus brief at a formative stage in development of the exams, that could have served as a strong basis for believing there were equally valid alternatives with less adverse impact, irrespective of counterarguments. Of course, if they had such information and acted upon it, they probably would never have administered the IOS exams to begin with. In essence, Kennedy restricted the evidence to the information available at the time of the city’s decision to cancel the promotions.
On the other hand, the timing of available information would be much less important in an adverse impact claim. The arguments, pro and con, relating to assessment centers could be made in the third phase of the adverse impact scenario, in which case we would have a battle of experts (call it “War of the Gladiators”). The ultimate ruling, of course, would depend on which side the district court judge favored.
Alas, none of the aforementioned happened. For that reason, we believe there were no major precedents established in Ricci directly relating to adverse impact per se.
Justice Kennedy’s Ruling
However, the implications for disparate treatment claims are enormous and, as noted above, beyond the ordinary. Let’s start with the ruling itself.
There was no question of a racial motive for discarding the test. In Kennedy’s words:
Whatever the City’s ultimate aim—however well intentioned or benevolent it might have seemed—the City made its employment decision because of race. The City rejected the test results solely because the higher scoring candidates were white. The question is not whether that conduct was discriminatory but whether the City had a lawful justification for its race-based action.
Even a cursory reading of the district court ruling would lead a reasonable observer to believe that had the outcome been different (i.e., more minority promotions), the exams would have been certified. That constitutes a racial motive that is not, per se, illegal, but requires justification (i.e., strong basis in evidence).
Kennedy felt it was necessary to balance the tension between disparate treatment and adverse impact. He first rejected a “certainty” criterion. Accordingly:
Forbidding employers to act unless they know, with certainty, that a practice violates the disparate-impact provision would bring compliance efforts to a near standstill. Even in the limited situations when this restricted standard could be met, employers likely would hesitate before taking voluntary action for fear of later being proven wrong in the course of litigation and then held to account for disparate treatment.
Kennedy also rejected a “good-faith” argument (based on Dr. Hornick’s input). Accordingly:
Allowing employers to violate the disparate-treatment prohibition based on a mere good-faith fear of disparate-impact liability would encourage race-based action at the slightest hint of disparate impact. A minimal standard could cause employers to discard the results of lawful and beneficial promotional examinations even where there is little if any evidence of disparate-impact discrimination. That would amount to a de facto quota system, in which a “focus on statistics”...could put undue pressure on employers to adopt inappropriate prophylactic measures.
Kennedy also incorporated the race-norming provision in the Civil Rights Act of 1991 (CRA-91) into his ruling. Accordingly:
If an employer cannot rescore a test based on the candidates’ race, §2000e-2(l), then it follows a fortiori that it may not take the greater step of discarding the test altogether to achieve a more desirable racial distribution of promotion-eligible candidates—absent a strong basis in evidence that the test was deficient and that discarding the results is necessary to avoid violating the disparate-impact provision.
Kennedy then articulated the “strong-basis-in-evidence” standard as a compromise between the conflicting demands of disparate treatment and adverse impact. Accordingly:
For the foregoing reasons, we adopt the strong-basis-in-evidence standard as a matter of statutory construction to resolve any conflict between the disparate-treatment and disparate-impact provisions of Title VII.
This sounds like rational thinking, but relative to prior Supreme Court precedents, it is an anomalous and confusing ruling.
So Where Are the Anomalies?
The anomalies have nothing to do with the Supreme Court’s ultimate ruling favoring Ricci et al. Rather, they have to do with the reasoning Kennedy used.
First, the “strong-basis-in-evidence” standard was first articulated by Justice Powell in Wygant v. Jackson Board of Education (1986) and later supported by Justice O’Connor in City of Richmond v. Croson (1989) and Adarand v. Pena (1995) as a basis for proving there is evidence of a remedial need that requires a remedy. These were all reverse discrimination cases decided under constitutional principles (14th Amendment in Wygant and Croson, 5th Amendment in Adarand). This fact was criticized by Justice Ginsburg in her dissent (see below). However, what’s anomalous to us is that the 14th Amendment has a time-honored “strict scrutiny” analysis for such cases, and Title VII has a parallel test from United Steelworkers v. Weber (1979). Either of these two analyses could have been more easily used to resolve the case.
The strict scrutiny test requires (a) a compelling government interest and (b) a narrowly tailored solution to that interest. The Weber test requires (a) an egregious violation and (b) a temporary nontrammeling solution. The racial motive in discarding the test without a “strong basis in evidence” satisfies the first prong in both tests, and the act of discarding the test is not narrowly tailored under constitutional principles or nontrammeling under the Weber test. There was no need to couch the ruling in terms of a conflict between disparate treatment and adverse impact; all that language seems to us unnecessary.
Second, the majority limited the Ricci ruling to Title VII. By itself, this is no big deal. However, Justice Kennedy implied that the Title VII and 14th (and 5th) Amendment rules may not be the same. Indeed, he wrote:
Our statutory holding does not address the constitutionality of the measures taken here in purported compliance with Title VII. We also do not hold that meeting the strong-basis-in-evidence standard would satisfy the Equal Protection Clause in a future case. As we explain below, because respondents have not met their burden under Title VII, we need not decide whether a legitimate fear of disparate impact is ever sufficient to justify discriminatory treatment under the Constitution.
That sets up the prospect that if an employer does satisfy the “strong-basis-in-evidence” standard under Title VII, it could be revoked under constitutional principles.
Third, the only mention of Hayden v. Nassau County (1999) was made as passing references in two parts of Justice Ginsburg’s dissent. Strangely, it was not mentioned in the majority ruling. We will not dwell on Hayden here; we discussed that case in detail in the April 2009 column. However, Hayden is a poster child for messing with test composition and at the time was even more controversial than the Ricci ruling (and may still be). Nassau County clearly played with the composition of its entry-level test for police officers until it got the outcome it thought was most valid with the least amount of adverse impact. Clearly, there was a racial motive. But there was an important difference between Hayden and Ricci. Nassau County was under court order to create a valid test, and the DOJ orchestrated the creation of a “blue ribbon panel” to do so (strong basis in evidence?). We find it hard to believe that the Supreme Court majority entertained Ricci and ignored Hayden.
In short, there was a preexisting mechanism for dealing with reverse discrimination claims under constitutional provisions and Title VII, meaning that there may not have been a need for pitting disparate treatment against disparate impact or pitting Title VII against the 5th and 14th Amendments.
Justice Scalia’s Concurrence
Although alone in his opinion, Justice Scalia suggested that the Supreme Court needs to at some future point determine whether the adverse impact rules in CRA-91 are themselves legal under constitutional principles. Scalia raised an interesting point that many have discussed over the years, though rarely in print. He stated:
It might be possible to defend the law by framing it as simply an evidentiary tool used to identify genuine, intentional discrimination—to “smoke out,” as it were, disparate treatment….But arguably the disparate-impact provisions sweep too broadly to be fairly characterized in such a fashion—since they fail to provide an affirmative defense for good-faith (i.e., nonracially motivated) conduct, or perhaps even for good faith plus hiring standards that are entirely reasonable.
In the landmark Griggs v. Duke Power (1971) ruling, adverse impact was the only principle considered even though there was ample evidence that Duke Power knew the impact that high school diplomas and cognitive tests would have on Blacks. We think there is good reason to believe that most companies today are trying their best to be efficient and fair at the same time, but they get no credit for that in an adverse impact case. On the other hand, particularly in police and firefighter cases, unions insert arbitrary rules in collective bargaining agreements that, arguably, favor one group over another. These are two different types of “motives,” but they are treated the same in adverse impact cases.
However, that said, the solution Scalia proposes incorporates an argument that, for all intents and purposes, was made in Wards Cove v. Atonio (1989) and struck down by Congress in CRA-91 (that the defense to adverse impact would be the same as the defense to disparate treatment). Wow! Just what we need: a constitutional battle between Congress and the Supreme Court.
There is an anomaly here as well. In Meacham v. KAPL (2008), the Supreme Court ruled 5–4 to incorporate adverse impact into the ADEA, albeit under different rules than for Title VII. In that case, Justice Scalia was the only one who favored using the EEOC rules for Title VII in age discrimination cases. The anomaly is that he was one of the five in the majority. It seems strange to favor Title VII rules in age cases and, at the same time, question whether adverse impact is a valid principle under the Constitution.
Justice Ginsburg’s Dissent
The dissenting Justices agreed with the district court conclusion that “intent to remedy the impact of a promotional exam is not equivalent to an intent to discriminate against nonminority applicants.” When compared to the majority opinion, the dissenting opinion almost reads as a different case along multiple dimensions. For example, the dissenting justices took into serious consideration the history of discrimination against minorities in municipal fire departments and in New Haven. In fact, Ginsburg’s opinion opens with the following:
In assessing claims of race discrimination, context matters. Grutter v. Bollinger, 539 U.S. 306, 327 (2003). In 1972 Congress extended Title VII of the Civil Rights Act of 1964 to cover public employment. At that time, municipal fire departments across the country, including New Haven’s, pervasively discriminated against minorities. The extension of Title VII to cover jobs in firefighting effected no overnight change. It took decades of persistent effort, advanced by Title VII litigation, to open firefighting posts to members of racial minorities.
In essence, Ginsburg and company viewed the situation in New Haven almost as remedial in nature. This context has implications for determining which burden should be applied in the justification of race-conscious decisions. Although the dissenting justices point out that Ricci is not an affirmative action case, they suggest that Ricci is more similar to a set of affirmative action cases than to the equal protection case law that acts as the foundation of the majority opinion. This case law focused on situations where disparate treatment and affirmative action programs were at odds, including Johnson v. Transportation Agency, Santa Clara Cty. (1987). Specifically, the dissenting justices write:
This litigation does not involve affirmative action. But if the voluntary affirmative action at issue in Johnson does not discriminate within the meaning of Title VII, neither does an employer’s reasonable effort to comply with Title VII’s disparate-impact provision by refraining from action of doubtful consistency with business necessity.
Relatedly, Ginsburg included a section on the appropriateness of the strong-basis-in-evidence standard endorsed by the majority. The dissenting justices traced the history of the strong-basis-in-evidence standard used in equal protection cases, pointing out that these cases focused on set asides and absolute racial preferences in school districts and contractor selection. Specifically,
The Court’s standard, drawn from inapposite equal protection precedents, is not elaborated. One is left to wonder what cases would meet the standard and why the Court is so sure this case does not….The cases from which the Court draws its strong-basis-in-evidence standard are particularly inapt; they concern the constitutionality of absolute racial preferences. See Wygant v. Jackson Bd. of Ed., 476 U. S. 267, 277 (1986) (plurality opinion) (invalidating a school district’s plan to lay off nonminority teachers while retaining minority teachers with less seniority); Croson, 488 U. S., at 499–500 (rejecting a set-aside program for minority contractors that operated as “an unyielding racial quota”). An employer’s effort to avoid Title VII liability by repudiating a suspect selection method scarcely resembles those cases. Race was not merely a relevant consideration in Wygant and Croson; it was the decisive factor. Observance of Title VII’s disparate-impact provision, in contrast, calls for no racial preference, absolute or otherwise.
Thus, the dissent concluded that the strong-basis-in-evidence standard was inappropriately applied by the majority and instead endorsed a lighter “reasonableness” burden closer to affirmative action cases under Title VII.
With regard to justifying the decision to set aside the promotion results, the dissent and majority agreed that the combination of a 4/5ths rule violation on the pass/fail rate of the test and an inexorable zero for Black promotions was compelling evidence of prima facie disparity regardless of the standard used. However, the justices disagreed on job relatedness and reasonable alternative justifications, in part because they considered different information in addition to using different standards. Recall that the results of a proactive adverse impact analysis were essentially the justification articulated by the city; however, information on job relatedness and reasonable alternatives expanded after litigation started and until oral argument in front of the Supreme Court. Remember, there was never an adverse impact case to which to refer back.
As described above, the majority considered only the adverse impact, job-relatedness, and reasonable alternative information that was available to the city before they made the decision to throw out the promotion list. The dissenting justices, on the other hand, considered a set of additional information gathered after the decision was made to throw out the promotion list and up until the oral argument in front of the Supreme Court. This included the brief written by the five SIOP Fellows and other resources that identify (a) some “fatal flaws” of the tests and (b) assessment centers as a reasonable alternative to written job knowledge tests for the jobs of interest. Ginsburg also cited various textbooks and scholarly articles from the personnel psychology literature. Specifically, the dissent identified the following flaws in the test and promotion process:
- The city used an arbitrary 60% written/40% oral weighting scheme that was specified in the collective bargaining agreement 2 decades ago. In addition, the terms of a collective bargaining agreement do not shield an employer from Title VII requirements;
- An important ability, command presence, was not measured by the tests. The dissent cited a baseball analogy from other case law to exemplify this issue: Boston Chapter, NAACP, 504 F. 2d, at 1023. (“[T]here is a difference between memorizing...fire fighting terminology and being a good fire fighter. If the Boston Red Sox recruited players on the basis of their knowledge of baseball history and vocabulary, the team might acquire [players] who could not bat, pitch, or catch.”)
- As described by the five SIOP Fellows, the rank order “rule of three” decision-making process the city used for promotions was not supported by statistical evidence.
- As described by the five SIOP Fellows and the personnel psychology literature, assessment centers are a known and reasonable alternative in the context of selection for upper level firefighter jobs.
- In the nearby city of Bridgeport, CT, there is a substantially higher minority percentage in lieutenant and captain positions, and Bridgeport more heavily weights the oral portion of their promotion test relative to the written portion. Thus, a change to the weighting scheme of the test could be a less adverse and reasonable alternative.
Based on the above information, the dissent concluded that the city did not discriminate by throwing out the promotion results because in reality (a) the promotion tests were flawed and (b) there were likely reasonable alternatives available (e.g., an assessment center). Thus, the decision to throw out the results was justified.
There are a few other points to note in the dissenting opinion. First, the justices devote some time to responding to the concurring opinion written by Justice Alito. Ginsburg and company question the facts of the case as presented by Justice Alito and conclude that political considerations were inappropriately equated with unlawful discrimination. Second, the dissenting justices noted how strange it was that the case was not remanded to the Second Circuit, where the lower court could have applied the newly announced burden to the facts of the case. Specifically:
The Court stacks the deck further by denying respondents any chance to satisfy the newly announced strong-basis-in-evidence standard. When this Court formulates a new legal rule, the ordinary course is to remand and allow the lower courts to apply the rule in the first instance. See, e.g., Johnson v. California, 543 U. S. 499, 515 (2005); Pullman-Standard v. Swint, 456 U. S. 273, 291 (1982). I see no good reason why the Court fails to follow that course in this case. Indeed, the sole basis for the Court’s peremptory ruling is the demonstrably false pretension that respondents showed “nothing more” than “a significant statistical disparity.”
Lastly, as in the dissenting opinion in Ledbetter v. Goodyear Tire (2007), Justice Ginsburg questions the “staying power” of the ruling, essentially suggesting that Congress should reverse the decision. However, in Ledbetter there was a single and simple issue (i.e., the timely filing period of a compensation claim) to reverse via statute (i.e., the Ledbetter Fair Pay Act). As this article shows, the Ricci ruling is complex, and we don’t think this is a ruling that can be easily reversed via statute. Perhaps the strong-basis-in-evidence burden could be removed in favor of the lighter reasonableness burden described above, but there may be too many moving parts for statutory reversal.
Conclusions
We think it’s necessary to try to differentiate the legal implications associated with Ricci from some of the more sensationalized reactions. Of course, cases like US v. New York City CSC (2009) that interpret Ricci precedent (or don’t) will clearly help with this distinction. In addition, there are some major practical implications for I-O psychologists and organizations that use I-O psychologists. As noted in the amicus brief by the five SIOP Fellows, I-O psychologists are an important resource for organizations dealing with test validation. Perhaps the dissenting justices framed the “why” best: “This case presents an unfortunate situation, one New Haven might well have avoided had it utilized a better selection process in the first place.” One lesson in Ricci is to use us and to use us the right way. In other words, organizations should get I-O psychologists involved early on in the test development process and make sure that I-O psychologists are involved throughout the process until adequate research has been conducted and employment decisions based on assessments are being made in reasonable ways.
For example, if the city had allowed a content validity report to be written as initially intended, perhaps the strong basis in evidence burden would have been met.
As another example, consider US v. New York City CSC, where the CSB started with input from a well-known I-O psychologist but chose to go it alone in doing the job analysis and developing the test. That likely contributed strongly to their loss in that case.
In addition, quality control is important, and technical review committees like the independent “blue ribbon panel” used to oversee the test development process in cases like Hayden are a strong example of leveraging independent I-O expertise to produce legally defensible selection procedures. The New Haven CSB would have been well served to have a similar independent panel of I-O psychologists that evaluated the test development process from beginning to end (really from selecting the winning RFP to overseeing validity documentation, employment decision making, a consideration of reasonable alternatives, etc.). This quality control is particularly valuable in situations where litigation is commonplace.
Of course, a successful technical advisory committee requires a group of I-O psychologists to agree on a test development and validation research agenda, which doesn’t always happen in practice. However, assuming that members of a technical advisory board come to general agreement, this quality control would likely establish a “strong basis in evidence” for all actions taken. Indeed, such a panel would likely have helped the New Haven CSB to consider weighting, cut scores, and alternatives with less impact much earlier in the process. Again, it might have also led to discarding the test before it was administered.
Although a technical review committee provides quality control during all phases of the development process, evaluating an already developed assessment program may also be a valuable service to organizations interested in understanding the adequacy of their selection procedures before those procedures are challenged in court or in an OFCCP audit. This post hoc assessment is often referred to as a human resource program audit. In this context an objective third party with I-O expertise can evaluate the adequacy of assessments and research on those assessments using a set of evaluation criteria. For example, the Uniform Guidelines on Employee Selection Procedures (UGESP, 1978) could be used as a model for assessing the adverse impact of a selection procedure, as well as whether validity research meets various technical standards. Likewise, SIOP’s Principles could be used as evaluation criteria. At the end of the audit, recommendations can be made regarding any additional validation research, refinements to the assessment, search for alternatives, and so on that may be necessary to buttress the adequacy of the HR process.
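As a rough illustration of the adverse impact portion of such an audit, the four-fifths rule of thumb from the UGESP can be sketched in a few lines of Python. This is a minimal sketch only: the function names, group labels, and counts below are hypothetical and are not the New Haven results, and a real audit would normally supplement the ratio with statistical significance tests and attention to sample size.

```python
def selection_rate(selected: int, applicants: int) -> float:
    """Proportion of a group's applicants who passed or were selected."""
    if applicants <= 0:
        raise ValueError("applicants must be a positive count")
    return selected / applicants


def four_fifths_check(counts, threshold=0.8):
    """Compare each group's selection rate with the highest group's rate.

    counts maps a group label to (number selected, number of applicants).
    A group is flagged when its impact ratio falls below the threshold
    (0.8 under the four-fifths rule of thumb).
    """
    rates = {g: selection_rate(s, n) for g, (s, n) in counts.items()}
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group has a nonzero selection rate")
    return {
        g: {
            "rate": r,
            "impact_ratio": r / highest,
            "flagged": (r / highest) < threshold,
        }
        for g, r in rates.items()
    }


# Hypothetical counts for illustration only (not the New Haven data).
example_counts = {"group_a": (30, 50), "group_b": (12, 40)}
result = four_fifths_check(example_counts)

for group, stats in result.items():
    print(group, round(stats["rate"], 2), round(stats["impact_ratio"], 2), stats["flagged"])
```

With these made-up counts, group_b passes at half the rate of group_a, so its impact ratio falls below four-fifths and it is flagged. A flag of this kind is only prima facie evidence; job relatedness and reasonable alternatives still have to be evaluated separately.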
Proactive HR audits were discussed in multiple presentations at the most recent SIOP conference in New Orleans. Consistent with the take-home message of those presentations, the Ricci ruling can indirectly be interpreted as endorsing stringent audit processes. Recall that the city essentially conducted a proactive HR audit (e.g., adverse impact analyses, job-relatedness and reasonable alternative considerations via CSB hearings). The results of that audit were the information evaluated by the majority of justices, who concluded that the evidence was not compelling enough to justify cancelling the promotions. Of course, the information that the city chose not to document or collect (i.e., a content validity report documenting the test development process) could have made all the difference. Regardless, had the city asked for more complete validity research and a more focused “quality control” audit from an objective party, perhaps problematic test characteristics would have been revised. Of course, it might have also led to strong evidence that justified discarding the test before it was administered, and Ricci v. DeStefano would have been a minor footnote in case law.
References
Landy, F. J. (2005). A judge’s view: Interviews with federal judges about expert witness testimony. In F. J. Landy (Ed.), Employment discrimination litigation: Behavioral, quantitative, and legal perspectives (pp. 503–572). San Francisco: Jossey-Bass.
Outtz, J. L. (2007). Less adverse alternatives: Making progress and avoiding red herrings. The Industrial-Organizational Psychologist, 45(2), 23–27.
Scharf, J. C. (2007). Slippery slope of “alternatives” altering the topography of employment testing? The Industrial-Organizational Psychologist, 45(2), 13–19.
Uniform Guidelines on Employee Selection Procedures, 29 C.F.R. 1607 et seq. (1978).
Cases Cited
Adarand v. Pena (1995) 515 US 200.
Bradley v. City of Lynn (D.Mass 2006) 443 F. Supp. 2d 145.
City of Richmond v. Croson (1989) 488 US 469.
Gillespie v. State of Wisconsin (CA7 1985) 771 F.2d 1035.
Griggs v. Duke Power Co. (1971) 401 US 424.
Guardians of NY v. Civil Service Commission (CA2 1980) 630 F.2d 79.
Gulino v. New York State Education Department (CA2 2006) 461 F.3d 134.
Hayden v. Nassau County (CA2 1999) 180 F.3d 42.
Johnson v. City of Memphis (2006) U.S. Dist. LEXIS 62823.
Johnson v. Transportation Agency, Santa Clara County, Ca. (1987) 480 US 616.
Ledbetter v. Goodyear Tire (2007) 550 US 618.
McDonnell Douglas Corp. v. Green (1973) 411 US 792.
Meacham v. Knolls Atomic Power Laboratory (KAPL) (2008) 128 S.Ct. 2395.
Police Officers for Equal Rights v. City of Columbus (CA6 1990) 916 F.2d 1092.
Ricci v. DeStefano (2009) 129 S. Ct. 2658.
United Steelworkers etc. v. Weber (1979) 443 US 193.
US v. City of New York (2009) U.S. Dist. LEXIS 63153.
Wards Cove Packing Company v. Atonio (1989) 490 US 642.
Wygant v. Jackson Board of Education (1986) 476 US 267.