
    Jeff D. PAIGE, individually & on behalf of others similarly situated, Plaintiff-Appellee, v. State of CALIFORNIA; California Highway Patrol; Edward Richard Gomez; Dwight Helmick, Commissioner of California Highway Patrol, Defendants-Appellants, and Business, Transportation & Housing Agency, of the State of California; Maurice NMI Hannigan, in his official capacity as Commissioner of the California Highway Patrol, Defendants.
    No. 01-55312.
    United States Court of Appeals, Ninth Circuit.
    Argued and Submitted Jan. 23, 2002.
    Filed May 31, 2002.
    As Amended on Denial of Rehearing and Rehearing En Banc July 18, 2002.
    
      Joel M. Cohn, Esq., Akin, Gump, Strauss, Hauer & Feld, Washington, DC, for the defendants-appellants.
    Della Bahan, Esq., Bahan & Associates, Pasadena, CA, for the plaintiff-appellee.
    Before REINHARDT, KOZINSKI and HAWKINS, Circuit Judges.
   OPINION

REINHARDT, Circuit Judge.

This case has a long and difficult history. The allegations are serious; the plaintiffs, highway patrol officers who are members of racial minorities, allege that their employer, the California Highway Patrol (“CHP”), maintains a promotional process that discriminates against them on the basis of race. The district court and this court have struggled with the issues involved, both because of the nature and importance of the allegations, and the complex legal and factual issues involved in racial discrimination cases.

In 1994, Jeff D. Paige, a black CHP lieutenant, brought this class action pursuant to 42 U.S.C. § 1983 on behalf of all current and future non-white CHP officers. He alleged that the promotional process of the CHP has a discriminatory impact on non-white officers in violation of Title VII of the 1964 Civil Rights Act. After a lengthy discovery period and a great number of submissions, the district court found that the CHP’s promotional process resulted in a disparate impact on the plaintiff class and, accordingly, granted partial summary judgment for the plaintiffs. The CHP appealed, and we reversed and remanded for further factual findings. See Paige v. State of California, 102 F.3d 1035 (9th Cir.1996); Paige v. State of California, No. 95-56669, 1996 WL 740839 (9th Cir. Dec.20, 1996) (unpublished memorandum disposition) [hereinafter collectively referred to as Paige I]. Upon remand, the district court ordered additional discovery, and again granted partial summary judgment for the plaintiffs. The CHP again appealed, and again we consider the issues.

Background Facts

As of December 31, 1993 (the last reporting date prior to the filing of the lawsuit), the CHP employed approximately 5,675 officers, approximately 1,119, or 19.7%, of whom were officers of color. Only 102 of these non-white officers were in positions above the entry-level rank; they constituted approximately 11.1% of the total number of officers in supervisory positions. In contrast, non-white officers comprised approximately 20.9% of the CHP officers in non-supervisory positions.

In order to become a peace officer with the CHP, an applicant must first be hired as a cadet and complete a six month training program. Thereafter, the successful applicant is assigned to an entry-level peace officer position. An officer may subsequently apply for a promotion in rank and advance to sergeant, lieutenant, captain, assistant chief, and finally, to deputy chief. No outside hiring occurs beyond the entry-level position.

In order to be eligible for promotion, an officer must pass a promotional examination. There is a different examination process for each supervisory rank. For example, the exam for the sergeants rank includes an oral and written component whereas that for the position of assistant chief is oral only. Although certain ranks may share the same examination format, the exam for each rank is comprised of different questions and covers different exam topics. Written exams generally consist of a multiple-choice component and an essay component. The oral exam is labeled a “qualifications appraisal interview.” For the exams with a traditional written and oral component (sergeants and lieutenants), the applicants are notified by letter after the written portion that they either passed or failed. Only if they passed are they permitted to take the remainder of the exam.

When the promotional examinations are completed, an eligibility list is compiled. The eligibility list is a list of the successful applicants, ranked according to their “weighted composite score” on the completed exam. As vacancies in the supervisory ranks occur, officers are promoted in the order in which their names appear on the eligibility list. The list is used only until new exams are conducted at which time a new list is created.

Analysis

In order to make a prima facie case of “disparate impact” under Title VII, the plaintiffs must show “that a facially neutral employment practice has a ‘significantly discriminatory’ impact upon a group protected by Title VII.” Moore v. Hughes Helicopters, Inc., 708 F.2d 475, 481 (9th Cir.1983); see also Wards Cove Packing Co. v. Atonio, 490 U.S. 642, 645-46, 109 S.Ct. 2115, 104 L.Ed.2d 733 (1989). This showing consists of two parts: the plaintiffs must demonstrate 1) a specific employment practice that 2) causes a significant discriminatory impact. Wards Cove Packing Co., 490 U.S. at 656-67, 109 S.Ct. 2115; Watson v. Fort Worth Bank & Trust, 487 U.S. 977, 994, 108 S.Ct. 2777, 101 L.Ed.2d 827 (1988). Statistical evidence is used to demonstrate how a particular employment practice causes a protected minority group to be underrepresented in a specific area of employment (for example, hiring or promotion). See Watson, 487 U.S. at 994, 108 S.Ct. 2777. The statistical analysis must show a disparity that is “sufficiently substantial” as to “raise such an inference of causation.” Id. at 995, 108 S.Ct. 2777.

In evaluating the impact of a particular process, we must compare the group that “enters” the process with the group that emerges from it. Here, the plaintiffs challenge several parts of the promotional process of the CHP as having a disparate impact on non-white officers. Ordinarily, we would analyze this claim by determining whether the group of officers that received promotions looks statistically different from the group that applied for such promotions; that is, we would compare the racial composition of the officers who are appointed to supervisory positions as a result of a challenged examination with the racial composition of the officers who applied for promotion to those positions. We have previously stated that “[t]he best evidence of discriminatory impact is proof that an employment practice selects members of a protected class ... in a proportion smaller than in the actual pool of eligible employees.” Moore, 708 F.2d at 482. However, we have also stated that this general principle is true only if there is not “a characteristic of the challenged selection device that makes use of the actual pool of applicants or eligible employees inappropriate.” Id.
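The comparison described above, between the group that enters a selection process and the group that emerges from it, can be illustrated with a short sketch. Every count below is hypothetical and is not drawn from the record. The opinion does not prescribe any particular statistical test; the pooled two-proportion z statistic is used here only as one common way of asking whether a difference in selection rates is larger than chance alone would readily explain.

```python
import math


def selection_rate(selected: int, applicants: int) -> float:
    """Fraction of a group's applicants who emerge from the process."""
    return selected / applicants


def two_proportion_z(sel_a: int, n_a: int, sel_b: int, n_b: int) -> float:
    """Pooled two-proportion z statistic for rate(a) minus rate(b)."""
    pooled = (sel_a + sel_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (sel_a / n_a - sel_b / n_b) / std_err


# Hypothetical counts for illustration only; NOT taken from the record.
nonwhite_applicants, nonwhite_promoted = 200, 20
white_applicants, white_promoted = 800, 160

nonwhite_rate = selection_rate(nonwhite_promoted, nonwhite_applicants)
white_rate = selection_rate(white_promoted, white_applicants)
z = two_proportion_z(
    nonwhite_promoted, nonwhite_applicants, white_promoted, white_applicants
)

print(f"non-white selection rate: {nonwhite_rate:.1%}")
print(f"white selection rate:     {white_rate:.1%}")
print(f"z statistic:              {z:.2f}")
```

A negative statistic indicates that the first group was selected at a lower rate than the second. Whether any given disparity is “sufficiently substantial” remains a legal question that such a figure can inform but not resolve.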

Here, the parties do not dispute that the officers who successfully made it to the CHP supervisory ranks constitute one group for comparison. The parties do dispute, however, the group to which this first group should be compared (“the comparative group”). The defendants argue that we should conduct our statistical analysis as we ordinarily would in a promotion case and use an “internal pool” (the actual pool of promotional applicants) as the comparative group. In contrast, the plaintiffs argue that an internal pool is not appropriate because the CHP’s discrimination in hiring renders that pool an inaccurate reflection of the number of qualified non-white candidates who should be eligible, and would apply, for promotions. Accordingly, the plaintiffs argue that an “external pool” should be used, that is, a pool consisting of similarly skilled peace officers in the state of California.

Prior to Paige I, the district court concluded that an external rather than internal pool was the appropriate comparative group to use in this case. On appeal, we held that the district court made insufficient factual findings to justify that conclusion. Accordingly, we remanded for further findings as to which type of pool was appropriate for use in this case and whether, using whichever pool was appropriate, a disparate impact had been demonstrated. After further discovery, the district court again found an external pool to be the appropriate comparative group, and again found a disparate impact. It found that the use of the internal pool was inappropriate because of barriers to the hiring process that resulted in an underrepresentation of non-whites in the group eligible to apply for promotions.

Much of this case’s history has concentrated on the debate over the use of an external versus internal pool. We may have inadvertently contributed to this diversion from the controlling issues. In Paige I, in holding that the use of the external pool was unjustified on the record before us, we stated that “[t]here was no evidence offered that the CHP discriminated in hiring, or of any employment practice that deterred non-white, non-supervisory employees from applying for promotion.” Paige I, 1996 WL 740839, at *2. The parties and the district court then devoted their efforts in large part to examining whether the CHP engaged in discriminatory hiring at the entry-level and assessing the appropriateness of using an external pool. However, at the time of Paige I, the district court had not yet made a critical ruling that turns out to be determinative of the appropriate comparative group issue. Following our remand in Paige I, the district court held that, in light of the limited nature of their pleadings, the plaintiffs lacked standing to challenge the “closed promotion process” of the CHP. This ruling, which plaintiffs do not contest on appeal, leads to the inevitable conclusion that the use of an external pool in this case is inappropriate.

The district court’s standing ruling was issued in response to the plaintiffs’ efforts to invalidate the CHP’s practice, pursuant to Cal. Veh.Code § 2251, of promoting only from within, as opposed to allowing officers from other police departments to apply for CHP supervisory positions. Obviously, as the district court noted, hiring only from within could not harm the plaintiffs, who are all members or future members of the CHP. In fact, prohibiting outside competition and thereby limiting the numbers of persons eligible to seek supervisory positions to those in the plaintiff class could only benefit the class members and improve their chances for promotion. As the district court explained, “[p]laintiffs are not subject to the limitations of § 2251 because, as members of the CHP, they are eligible to apply for supervisory vacancies.” It therefore found that they lacked standing to challenge the practice. In short, the district court held that the plaintiffs may only challenge the aspects of the promotional process that adversely affect them, and that a closed promotional process causes them no injury.

The use of an external pool might well be appropriate in a case in which the plaintiffs are free to challenge the validity of a closed promotional process and to seek relief in the form of an order requiring that candidates both from within and without an employer’s ranks be allowed to compete for supervisory vacancies. In that circumstance, the court would have to determine whether the employer’s closed promotional process served to limit disproportionately the number of non-whites who obtain supervisory positions with the employer, and accordingly, the comparative group would consist of all potential applicants. Where, however, as here, the plaintiffs are by class definition all members of the CHP who may only contest the aspects of the closed promotional process that adversely affect their own opportunities for promotion, the only relevant question is whether the examination process adversely affects non-white as compared to white CHP officers. In other words, does the challenged practice favor white officers over non-white officers? To answer this question, an internal pool (the pool of all officers who apply for promotion) constitutes the appropriate comparative group.

In sum, we hold that an internal pool is the proper comparative group for use in determining whether the various parts of the CHP’s promotional process have a disparate impact on non-white officers. Although this holding requires us to reverse the district court’s grant of summary judgment for the plaintiffs, it does not require us to direct that summary judgment be awarded to the defendants. Instead, we must ask which parts of the promotional process, if any, have a disparate impact on non-white officers. This can only be answered by employing a statistical analysis based on the proper comparative group. Because the parties offer conflicting inferences drawn from the evidence they each offered as to whether the use of an internal pool shows such a discriminatory effect, there is a genuine dispute of material fact as to whether the promotional exams have a disparate impact, and this action must proceed to trial.

There are, however, questions of law bound up in these factual questions. In order to give guidance to the parties and the district court, and because this case was filed eight years ago and has been before this court twice already, we address these issues now. The defendants challenge the validity of the plaintiffs’ statistical analyses in three ways: First, the defendants argue that the plaintiffs improperly aggregate data from the various written examinations; second, they contend that the plaintiffs erroneously group all non-white officers together rather than analyzing the data according to each separate minority group; and third, the defendants argue that the plaintiffs may not include data from before the beginning of the liability period.

With regard to the first two arguments against the aggregation of data, it is a generally accepted principle that aggregated statistical data may be used where it is more probative than subdivided data. See Eldredge v. Carpenters 46 N. Cal. Counties Joint Apprenticeship and Training Comm., 833 F.2d 1334, 1339-40 nn. 7 & 8 (9th Cir.1987). Such use is particularly appropriate where small sample size may distort the statistical analysis and may render any findings not statistically probative. See Int’l Brotherhood of Teamsters v. United States, 431 U.S. 324, 340 n. 20, 97 S.Ct. 1843, 52 L.Ed.2d 396 (1977); Stout v. Potter, 276 F.3d 1118, 1123 (9th Cir.2002). Indeed, some commentators suggest that stratification should be upheld only if the employer can demonstrate that “the stratification is appropriate, and that the stratifying variable is business justified.” Ramona L. Paetzold & Steven L. Willborn, The Statistics of Discrimination § 5.08 at 35 (1996 & Supp.2001).

The defendants first argue that the plaintiffs improperly group data from the various written exams for the supervisory ranks. The district court found aggregation of the supervisory positions to be more probative than subdivided data. We agree. The plaintiffs demonstrated, by pointing to the State Personnel Reports, the Census reports, and the CHP’s own descriptions of the positions, sufficient commonality among the duties and skills required by the various supervisory positions to justify aggregation. The defendants themselves group the supervisory positions together for purposes of their reports to the State Personnel Board and the EEOC. Additionally, aggregation of similar employment positions has been done in other Title VII race discrimination cases. See Hazelwood, 433 U.S. at 308 n. 13, 97 S.Ct. 2736 (aggregating various types of public school teacher positions for purposes of analysis); Simmons v. City of Kansas City, No. CIV.A.88-2603-0, 1992 WL 403096, at *6 (D.Kan. Dec.7, 1992) (upholding statistical analysis that grouped testing procedures for all supervisory ranks of police department).

Second, the defendants contest the district court’s ruling that aggregating non-white officers resulted in more probative data than would dividing them according to individual minority groups. Again, we agree with the district court. In general, “the plaintiff should not be required to disaggregate the data into subgroups which are smaller than the groups which may be presumed to have been similarly situated and affected by common policies.” Eldredge, 833 F.2d at 1340 n. 8 (quoting D. Baldus & J. Cole, Statistical Proof of Discrimination § 7.0-7.2 (1980 & 1986 Supp.)). Here, plaintiffs’ theory is that the employment practices have the identical discriminatory effect upon members of all minority groups, and that those practices unlawfully benefit solely the members of the white majority. Right or wrong, they are entitled to attempt to prove their case. In addition, the defendants have not put forth any evidence that certain minority groups should be evaluated differently than others or that the promotional process has treated particular minority groups differently. Cf. Rich v. Martin Marietta Corp., 522 F.2d 333, 346 (10th Cir.1975) (holding that inclusion of “Orientals and American Indians” was not appropriate because they were not similarly situated in terms of numbers in upper echelon of the labor force). Finally, the defendants have not presented any evidence that suggests that the stratification of data on the basis of particular minority groups is business justified.

The defendants’ final argument regarding plaintiffs’ statistics is that the plaintiffs erroneously rely on data taken from examinations and corresponding eligibility lists that expired before the start of the liability period. However, we have held that it is appropriate to admit pre-liability data into evidence in a disparate impact case if promotional practices remain similar over a long period of time, as they have in this case. See, e.g., Bouman v. Block, 940 F.2d 1211, 1226 (9th Cir.1991) (allowing plaintiffs in a Title VII disparate impact case to aggregate results from multiple promotional exams because “courts have repeatedly looked at trends from past examinations to see if the total pass rate showed evidence of discrimination”); Eldredge, 833 F.2d at 1339 n. 7 (finding data in a Title VII disparate impact case “more complete and reliable” when aggregated over nine years than when shown year-by-year); see also 28 C.F.R. § 50.14 at § 4(d) (1977) (stating that in a Title VII disparate impact case, when the data indicate that a procedure has discriminatory impact, but the data set is too small to be reliable, plaintiffs may supplement data with evidence concerning the impact of the procedure over a longer period of time). We agree with the district court that pre-liability period data may properly be included in the statistical analysis in this case.

In sum, we reject the defendants’ three arguments contesting certain aspects of the plaintiffs’ statistical analyses. The plaintiffs may aggregate data from the various supervisory written examinations and may group all non-white officers together for comparison purposes. Additionally, “pre-liability period” data may be included in the statistical analyses. The defendants are of course equally free to introduce whatever data they may deem to be appropriate, subject to the further rulings of the district court.

The defendants also argued before the district court that even if its promotional process did have a disparate impact on non-white officers, the process “fulfill[s] a genuine business need” and is therefore job related and permissible under Title VII. Griggs v. Duke Power Co., 401 U.S. 424, 432, 91 S.Ct. 849, 28 L.Ed.2d 158 (1971); see also 42 U.S.C. § 2000e-2(k)(1)(A)(i). We agree with the district court that the CHP failed to show an appropriate validation of the promotional process as a whole or of its individual parts, in part because it did not present any evidence regarding how the examinations actually test for the skills identified by the CHP as critical to performing well in a particular supervisory rank. See Griggs, 401 U.S. at 432, 91 S.Ct. 849; Assoc. of Mexican-American Educators v. State of California, 231 F.3d 572, 585 (9th Cir.2000) (en banc); Bouman, 940 F.2d at 1228. Therefore, on the record before us, the defendants have failed to meet their burden of production demonstrating that the promotional process is “job related.”

Because we hold that an external pool is not the appropriate comparative group for determining disparate impact, the summary judgment in favor of the plaintiffs cannot stand. We also hold that the defendants have failed to demonstrate that the promotional process at issue fulfills a genuine business need; thus, they also are not entitled to summary judgment. Genuine issues of material fact remain. We therefore remand for trial at which time the parties may submit appropriate statistical and other evidence in light of our holding. We also vacate the injunction issued by the district court because it was based in part on its granting of summary judgment to the plaintiffs and its finding of disparate impact using an external pool. We express no view as to whether injunctive relief would otherwise be appropriate.

REVERSED AND REMANDED. 
      
      . During the period relevant to this lawsuit, the examinations were given approximately every two years. The questions differed each time the exams were administered. Although individuals who reach the higher supervisory ranks may apply for the position of Chief, there is no promotional exam for that position, and the record does not reflect whether outside candidates are eligible to apply.
     
      
      . During the relevant period of time for this lawsuit, the sergeants and lieutenants promotional exams were comprised of a written and oral component; the captains exam consisted of an "in-basket” exercise and an oral interview; and the assistant chiefs and deputy chiefs exams consisted of oral interviews only. An "in-basket” exam is a written exam in which the applicants write answers to hypothetical problems that they may confront in their future employment.
     
      
      .We review the district court’s conclusion as to a prima facie case of disparate impact de novo but review the underlying findings of fact for clear error. See Tiano v. Dillard Dept. Stores, Inc., 139 F.3d 679, 681 (9th Cir.1998). In particular, we give deference to a district court’s findings regarding statistical evidence. See Hazelwood Sch. Dist. v. United States, 433 U.S. 299, 312, 97 S.Ct. 2736, 53 L.Ed.2d 768 (1977).
     
      
      . A plaintiff need not prove discrimination with "scientific certainty”; he must, however, prove any such charge by a preponderance of the evidence. Eldredge v. Carpenters 46 N. Cal. Counties Joint Apprenticeship and Training Comm., 833 F.2d 1334, 1338 (9th Cir.1987) (quoting Bazemore v. Friday, 478 U.S. 385, 106 S.Ct. 3000, 92 L.Ed.2d 315 (1986)).
     
      
      . The "external pool” is defined by the categories of the 1990 California Census that correspond to the similarly skilled positions in the CHP. According to the census, the statewide percentage of non-white "Supervisors, Police and Detectives” (census code 414, supervisory positions) is 23.1%. The statewide percentage of non-white "Police and Detectives, Public Service” (census code 418, non-supervisory positions) is 28.8%. The district court adopted the 1990 California Census codes as the appropriate measurements of qualified non-white applicants for the supervisory CHP positions.
      With regard to the internal pool, of the 2,842 officers who applied to be promoted, 572, or 20.1%, were non-white. Thus, the difference between using an internal and an external pool is that in the former case, non-whites would constitute 20.1% of the total number of (actual) applicants and in the latter, 28.8% of the total number of (potential) applicants. The internal pool would, of course, vary depending on the aspect of the promotional process challenged. For example, if the analysis focused solely on the sergeants written exam, the internal pool would consist of the officers who applied for promotion to the sergeants rank.
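The percentages recited in this footnote and in the Background Facts can be reproduced with simple arithmetic. The sketch below uses only the counts and census figures stated in the opinion; the helper name `percent_share` and the variable names are illustrative assumptions, not terms used by the parties or the district court.

```python
def percent_share(part: int, total: int) -> float:
    """Return `part` as a percentage of `total`."""
    return 100.0 * part / total


# Counts stated in the opinion.
internal_pool_share = percent_share(572, 2842)   # non-white share of actual applicants
workforce_share = percent_share(1119, 5675)      # non-white share of all CHP officers

# External benchmarks stated in the opinion (1990 California Census).
EXTERNAL_NON_SUPERVISORY = 28.8  # census code 418
EXTERNAL_SUPERVISORY = 23.1      # census code 414

# Difference in percentage points between the two candidate benchmarks.
gap = EXTERNAL_NON_SUPERVISORY - internal_pool_share

print(f"internal pool, non-white share: {internal_pool_share:.1f}%")
print(f"CHP workforce, non-white share: {workforce_share:.1f}%")
print(f"external minus internal:        {gap:.1f} percentage points")
```

The size of that difference shows why the choice of comparative group matters: the same group of promoted officers is measured against a noticeably higher benchmark when the external pool is used.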
     
      
      . The plaintiffs have not offered any evidence to suggest that non-white officers who are eligible for promotion are dissuaded from actually applying for these promotions. Therefore, the pool of actual applicants is appropriate for statistical analysis as opposed to using the pool of officers who are eligible to apply.
     
      
      . The fact that the plaintiffs in this case are required to contest particular aspects of the promotional process comports with the general rule that, unless the factors are incapable of separation, a plaintiff must demonstrate that a particular factor or aspect of a decisionmaking process has a disparate impact. See Watson, 487 U.S. at 994, 108 S.Ct. 2777; Stout v. Potter, 276 F.3d 1118, 1124 (9th Cir. 2002) (stating that plaintiffs generally cannot attack overall decisionmaking process, rather they must identify a particular element or practice); see also 42 U.S.C. §§ 2000e-2(k)(1)(A)(i) and (B)(i). But see Powers v. Alabama Dept. of Educ., 854 F.2d 1285, 1293 (11th Cir.1988) (“[P]laintiffs alleging class-wide discrimination in promotions are not required to isolate the particular aspect of the promotional process that is responsible for the discriminatory impact.”).
     
      
      .It appears from the district court's order that the judge may have already considered some of the plaintiffs’ data in which they used an internal pool for their statistical analysis. The district court stated that it granted partial summary judgment for the plaintiffs "on the issue of disparate impact resulting from the use of written examination for promotional eligibility.” However, such a conclusion would not be possible using an external pool because the external pool data is not subdivided according to various types of examinations. Nevertheless, the district court explicitly stated that an external pool was the appropriate measure for determining disparate impact in this case. If only because of this internal inconsistency, we cannot affirm the district court’s grant of summary judgment in favor of the plaintiffs on the alternate ground that it is supported by statistical data based on the use of an internal pool.
     
      
      . The defendants also contest categorizing the “in-basket” captains exam as a written examination. However, the defendants’ own witness, the Commander of the CHP’s Selections and Examination Section, made clear that the in-basket exam is a written exam. He described the captains examination process as follows: "Candidates are required to respond in writing to the problems contained in the in-basket exercise. These written responses are evaluated for technical content and writing proficiency by panel members of the qualifications appraisal oral interview that follows the in-basket exercise.” Thus, it is clear that the in-basket exam qualifies as a written exam.
     
      
      . As additional support for the aggregation of the statistics concerning non-white officers, we note that the class was certified as a class of "non-white” officers and the CHP did not contest this aspect of the certification. See Paige I, 102 F.3d at 1042 (affirming use of class-based action and stating how case would look at use of race in all applications for promotions); see also Wards Cove, 490 U.S. at 647, 109 S.Ct. 2115 (action in which plaintiff class comprised of all non-white workers); Domingo v. New England Fish Co., 727 F.2d 1429, 1442 (9th Cir.1984) (same).
     
      
      . On summary judgment, the plaintiffs presented statistical analyses using an internal pool in support of their contentions as to the disparate impact on non-white officers of (a) the sergeants written examination standing alone; (b) the sergeants, lieutenants, and captains written examinations combined; and (c) the use of a weighted composite score for determining the eligibility lists for the sergeants rank. Upon remand, the plaintiffs are not limited to challenging these aspects of the promotional process but may offer appropriate statistical evidence regarding any of the classifications or examinations involved.
     