
    Robert Earl MISTER, on behalf of himself and all others similarly situated, Plaintiffs-Appellants, v. ILLINOIS CENTRAL GULF RAILROAD COMPANY, Defendant-Appellee.
    No. 86-2381.
    United States Court of Appeals, Seventh Circuit.
    Argued Sept. 22, 1987.
    Decided Oct. 29, 1987.
    Rehearing and Rehearing En Banc Denied Dec. 7, 1987.
    
      Jerome J. Schlichter, Carr, Korein, Schlichter, Kunin & Mongroy, East St. Louis, Ill., Jeffrey C. Bannon, E.E.O.C., Washington, D.C., for plaintiffs-appellants.
    Richard P. Nash, Gundlach, Lee, Eggmann, Boyle & Roessler, Belleville, Ill., for defendant-appellee.
    
      Before WOOD, FLAUM, and EASTERBROOK, Circuit Judges.
   EASTERBROOK, Circuit Judge.

The Illinois Central Gulf Railroad (ICG) needs many strong backs. Would-be laborers apply to a central office, where a clerk hires from among those whose applications are less than six months old. During 1979 and 1980 the ICG had eight divisions, autonomous in hiring. The hiring office of the St. Louis division was in Carbondale, Illinois, where a single clerk hired most laborers. During 1979 the Carbondale office hired 11% of the black applicants for laborer jobs (including applicants with no preference for type of job) and 39% of the white applicants for these positions. In 1980 the railroad hired no black applicants and 6.1% of the white applicants. We have the ensuing class action, filed under Title VII of the Civil Rights Act of 1964, 42 U.S.C. § 2000e et seq., by one of 1979’s disappointed black applicants.

The district court certified a class of applicants for laborers’ positions in the ICG’s St. Louis division between December 8, 1978, and the filing of the case in early 1981. After a lengthy trial, the court entered judgment for the ICG. 639 F.Supp. 1560 (S.D.Ill.1986). The court concluded that the class had established a prima facie case of discrimination under both disparate treatment and disparate impact approaches, but that the ICG had demonstrated that a neutral rule — the desire to hire laborers who lived close to work — not only accounted for the disparity but also was supported by business necessity (a requirement in the disparate impact portion of the case).

The applicant class contends on appeal that the ICG has not established that it had a policy of hiring the applicants who lived closest to their places of work, let alone that this policy produced the racially disparate effects or was justified if it did. The ICG, for its part, attacks the district court’s conclusion that the class made out a prima facie case; it maintains that it did not need to offer any justification at all. We conclude, with the district court, that the class established a high probability of disparate treatment requiring the ICG to offer a race-neutral explanation for its hiring. Cf. Furnco Construction Corp. v. Waters, 438 U.S. 567, 98 S.Ct. 2943, 57 L.Ed.2d 957 (1978); Teamsters v. United States, 431 U.S. 324, 97 S.Ct. 1843, 52 L.Ed.2d 396 (1977). The ICG’s explanation does not explain anything, however, because the ICG never showed that its preference for short commutes accounted for the racial disparity. The district court’s contrary conclusion is clearly erroneous. See Anderson v. Bessemer City, 470 U.S. 564, 573-75, 105 S.Ct. 1504, 1511-12, 84 L.Ed.2d 518 (1985); Pullman-Standard v. Swint, 456 U.S. 273, 287-90, 102 S.Ct. 1781, 1789-91, 72 L.Ed.2d 66 (1982). The class therefore prevails on its disparate treatment theory. We do not decide whether the disparate impact theory — here principally a challenge to a “bottom line” produced by “subjective” hiring — would have supported any relief. See Watson v. Fort Worth Bank & Trust, 798 F.2d 791 (5th Cir.1986), cert. granted, — U.S. -, 107 S.Ct. 3227, 97 L.Ed.2d 734 (1987), which presents one of the many tough questions about using the disparate impact method to challenge subjective decisions, questions that we shall avoid. See also, e.g., Griffin v. Board of Regents, 795 F.2d 1281, 1288 n. 14 (7th Cir.1986) (reserving questions about the application of the disparate impact model to subjective decisionmaking).

I

The disparate treatment claim is that the ICG considered race when making hiring decisions. The class offered several pieces of evidence that, the district court believed, established a presumptive (prima facie) case on this score. (1) The railroad hired a much larger portion of white applicants than of black applicants. The plaintiffs’ expert witness testified that there was less than one chance in a million that this disparity was consistent with race-neutral hiring. (2) The railroad hired more than 500 laborers in 1979, including 11 for work in East St. Louis. About 150 people from East St. Louis, most of them black, applied for work on March 27, 1979, while the ICG had more than 100 laborer jobs open on a repair project. The ICG did not hire a one of these applicants, though when it hired laborers to work in East St. Louis itself it turned outside that city (sometimes far outside it) to hire white laborers. (3) Although the railroad argued that its desire to hire local labor explained the disparate consequences, a survey of its employment records showed that 122 laborers holding non-mobile jobs (that is, jobs with a single place of work) lived more than 50 miles from their job sites. All but three of these laborers are white.

This combination powerfully suggests that the Carbondale office of the ICG had race on its mind when hiring laborers in 1979 and 1980. It is hard to imagine a stronger case, short of an announcement of discrimination. The ICG attacks only the statistical component of this showing, on two grounds: bad data and inaccurate assumptions about the labor market. The first is unavailing. The plaintiffs’ expert used the best data available; that the data were not better is the ICG’s fault, and we agree with the district court that the data at hand were good enough even though imperfect. 639 F.Supp. at 1573-75.

The argument about the labor market goes to the heart of the case. The plaintiffs' expert assumed that the St. Louis division of the ICG is a single labor market. This meant that every applicant was interested in every job, and the railroad potentially interested in every applicant for every job. Yet the St. Louis division extended more than 200 miles north to south, from central Illinois to western Tennessee, and as the ICG points out it is unlikely that applicants from one end of the division really wanted to work at the other. It adds that it was not interested in people who lived far from the places that needed laborers. Only an analysis of many geographically smaller labor markets, the ICG insists, could support a valid conclusion about its hiring processes.

The district court did not agree with this argument, and neither do we. The ICG used one hiring office for the entire division and maintained one set of employment records. It was not possible to construct geographic markets out of preferences expressed in the applications, for there were none. The forms the railroad provided to applicants did not ask them where in the division they wanted to work or suggest that they had any say in the matter. Anyway, the expert witness did not need to examine smaller labor markets unless the geographic distribution of applicants by race was uneven. If black and white applicants throughout the ICG’s St. Louis division lived (on average) equal distances from the ICG’s work sites, then the assumption that there was only one labor market made no difference. Smaller labor markets would mirror the division as a whole; if the ICG’s performance were better in some, it was bound to be worse in others. Only non-uniform distribution would make the study inappropriate. If, say, the black applicants predominantly lived in the center of the district while the work was located at the far north and far south, then the plaintiffs’ assumption would invalidate the results. The study was enough to cast on the railroad the burden of showing that the applicants were not homogeneously distributed by race with respect to places of work. But the ICG never showed that there was such a non-uniform distribution of applicants; indeed, as we conclude below, the railroad did not show even that distance from work mattered much.

The court (and therefore the expert witness) wants to test hypotheses relevant to the litigation. The witness formulates the hypothesis (such as: “the ICG was hiring without regard to race” or “the ICG was hiring labor from nearby”) and uses this hypothesis to make predictions. Then the expert examines the data to see whether they confirm the predictions and thereby support the hypothesis. Much judicial discussion about statistical evidence labors over questions whether findings are reliable enough to confirm a prediction and support or reject the hypothesis. That is, how confident can we be that chance or irrelevant details did not account for the pattern we observe in the data? Unless a chance outcome is unlikely, the study does not confirm or refute the hypothesis. Drawing from offhand footnotes in two cases, Hazelwood School District v. United States, 433 U.S. 299, 308-09 n. 14, 311 n. 17, 97 S.Ct. 2736, 2741-42 n. 14, 2743, n. 17, 53 L.Ed.2d 768 (1977), and Castaneda v. Partida, 430 U.S. 482, 496-97 n. 17, 97 S.Ct. 1272, 1281-82, n. 17, 51 L.Ed.2d 498 (1977), courts have produced a “rule” that findings should be accepted as “statistically significant” when the observations are more than two standard deviations away from the values predicted if the null hypothesis (no discrimination) holds. E.g., Palmer v. Shultz, 815 F.2d 84, 90-97 (D.C.Cir.1987) (summarizing cases and literature). This rule reduces the number of “Type One errors” (false findings of discrimination) at the potential cost of an increase in “Type Two errors” (finding no discrimination when there is some). Courts do this in part because departures from “random” hiring may arise because of factors other than race, factors hard to capture in the statistical models. See Paul Meier, Jerome Sacks & Sandy L. Zabell, What Happened in Hazelwood: Statistics, Employment Discrimination, and the 80% Rule, 1984 Am.Bar Foundation Research J. 139, 158-70. 

Having decided to make tests of “statistical significance” the measure of legal significance as well, judges have offered views on the differences for this purpose between one-tailed and two-tailed tests and between Z-statistics and t-statistics. E.g., Griffin, 795 F.2d at 1290-91 & n. 18; Coates v. Johnson & Johnson, 756 F.2d 524, 536 n. 11 (7th Cir.1985). Yet the finer points of significance-testing are pertinent only if the analyst has formulated the hypothesis correctly and decided what pattern (if established) will confirm or refute that hypothesis. The first, hardest, and often the only task is deciding what to look for. See Meier, Sacks & Zabell; Thomas J. Campbell, Regression Analysis in Title VII Cases, 36 Stan.L. Rev. 1299 (1984); Daniel L. Rubinfeld, Econometrics in the Courtroom, 85 Colum.L. Rev. 1048 (1985). That the 1979 hiring ratios (11% of black applicants, 39% of white applicants) are more than eight standard deviations from the result the expert took as his null hypothesis is important only if the test has been specified correctly.
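
The arithmetic behind a statement that hiring lies some number of standard deviations from the null hypothesis can be sketched as follows. This is a minimal illustration using the binomial model of the Castaneda and Hazelwood footnotes, and it is not the computation the plaintiffs' expert actually performed. The applicant counts are hypothetical (the opinion reports only the hiring rates), and the function name is an assumption made for the example.

```python
import math

def deviations_from_race_blind(black_applicants, black_hired,
                               white_applicants, white_hired):
    """Return how many binomial standard deviations black hires fall
    short of the count expected if hiring ignored race."""
    total_applicants = black_applicants + white_applicants
    total_hired = black_hired + white_hired
    # Null hypothesis: every applicant has the same chance of being hired.
    p = total_hired / total_applicants
    expected = black_applicants * p
    sd = math.sqrt(black_applicants * p * (1 - p))
    return (expected - black_hired) / sd

# HYPOTHETICAL counts, chosen only so the rates match the reported 1979
# figures (11% of black applicants and 39% of white applicants hired).
z = deviations_from_race_blind(400, 44, 1200, 468)
print(f"shortfall: {z:.1f} standard deviations")
```

Under these assumed counts the shortfall is far beyond the two-standard-deviation convention. A different division of applicants by race would yield a different figure, which is why the sketch says nothing about the expert's reported value.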

The ICG’s argument about labor markets is an effort to persuade us that the plaintiffs’ expert specified the hypothesis incorrectly. The plaintiffs replied that the ICG’s decision to hire from Carbondale shows that there was a single labor market. Neither side asked whether the specification of the test adequately accounted for the hiring decision. Plaintiffs’ statistical work leaves much to be desired. The plaintiffs’ expert used no independent variables other than race. He assumed, in other words, that all applicants are identical in every respect except race. This is not necessarily true. For laborers’ jobs, other variables matter—physical condition (for which age and a weight/height ratio may be proxies) and employment history (has the person been fired from other jobs or convicted of job-related offenses?) are important to any employer. The omission of these variables weakens the plaintiffs’ case by leaving open the possibility that important, non-racial variables account for the hiring decisions. But the ICG does not suggest that these things account for the pattern of its hiring. The ICG’s expert replicated the findings of the plaintiffs’ expert, and the ICG does not offer any explanation other than distance from work for the startling disparity in rates of hiring. We are not about to discount the plaintiffs’ statistical work on grounds that the employer, with the best access to data, chose not to raise. We therefore conclude, with the district court, that the plaintiff class made out a presumptive case of disparate treatment.

II

The railroad’s response to the plaintiffs’ evidence is to say that distance accounts for everything. The district court found that the ICG had a preference for people who lived close to work (where “close” meant less than 50 miles), though not a hard-and-fast rule. We accept this finding. The ICG then asserts that black applicants lived farther from the ICG’s places of work. If this disparity in distance produced the disparity in hiring, the ICG would have a sufficient answer to the disparate treatment claim. (Whether it would have a sufficient reply to the disparate impact claim we need not say.) The essential ingredient of the ICG’s case was proof that black applicants did live farther from the worksites, enough so to account for the different rates of hiring.

The ICG had a straightforward way to establish its claim. Applications for employment gave the applicants’ addresses. The ICG could have measured the distance between the residences of “active” applicants (those who applied within six months) and the places for which it was hiring at a given moment. If the data showed that it hired the applicants who lived close by, without regard to race — that is, if black and white applicants who lived 10 miles from work had equal probabilities of being hired, and applicants who lived 10 miles from the site had higher probabilities than applicants who lived farther away — it would have had a good reply to the plaintiffs’ case. But the ICG did not conduct this inquiry. Its expert regarded the applicant data as incomplete and had trouble matching the applications active at a given moment against jobs then available. See 639 F.Supp. at 1570-71. So the ICG computed the distance from work only for some 80% of the people it actually hired.

This made it difficult, though not impossible, to test the ICG’s hypothesis. Showing (as the ICG did) that 80% of the laborers for which it compiled data live within 50 miles of their assigned station does not prove that the ICG used any sort of distance test in hiring. People generally want jobs close to home, and laborers especially are unlikely to drive more than 100 miles round trip daily for the wage being offered. An employer that paid no attention to the residence of its applicants likely would find that 80% or more of its employees lived within 50 miles of work; self-selection would see to that. Furthermore, the observation that 80% of those hired lived within 50 miles of work is consistent with the hypothesis that 80% of those not hired lived within 50 miles of the ICG’s worksites. It is consistent, indeed, with the proposition that black applicants lived closer to the jobsites than white applicants, and that had the ICG hired more blacks it would have had a more compact labor force.

The hypotheses that black applicants live farther from the ICG’s yards than white applicants and that the ICG tried to minimize the commuting distance have some testable implications, however, even if the data are limited to people actually hired. One is that the white laborers live closer to work than the black laborers do. If, for example, white applicants live (on average) 30 miles from the ICG’s worksites, while black applicants live 60 miles away, and if the distributions around these means are normal, then the ICG will be hiring from the “tail” of the black applicant pool and the center of the distribution of the white applicant pool. The whites hired will live closer to work than the blacks hired; the distribution of commuting distances for white employees also will be more compact. (That is, the standard deviation of distances from work for white employees will be lower than the standard deviation for black employees.) Although the ICG hired as its expert witness a statistician who had written extensively on the subject, the ICG either neglected to test for these implications or elected not to inform the district court of its findings.

The ICG’s expert introduced a number of tabulations that have allowed us to determine the averages, though with some imprecision because we used only the aggregate data the expert compiled. Black laborers hired for non-mobile jobs lived only half as far from their worksites as white laborers. The distribution of commuting distances for black laborers was much more compact.

                                                     1976-1980           1979-1980
                                                   Blacks   Whites     Blacks   Whites

Mean distance from work, miles                      14.2     29.4       16.5     30.7

Standard deviation of distance from work, miles     15.9     33.7       18.8     28.6

The ICG’s expert did not have data for all of the ICG’s employees, so we used a standard test of statistical significance to inquire whether the distances (and distributions) of white and black laborers might be the same, appearing different only because of the sampling from the pool of employees. The values of the t-test exceed 4.0 for both sets of years, which implies a probability less than one in a thousand. So far the ICG’s data support the plaintiffs’ case rather than the ICG’s.
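
A test of this kind can be sketched from summary figures alone. The example below is a minimal illustration using an unequal-variance (Welch) t-statistic; the opinion does not say which variant of the t-test was used, so that choice is an assumption. The means and standard deviations are those tabulated above, while the sample sizes are hypothetical because the opinion does not report them.

```python
import math

def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t-statistic for a difference in means, allowing unequal variances."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error

# HYPOTHETICAL sample sizes; the opinion does not give them.
N_BLACK, N_WHITE = 60, 400

# Means and standard deviations (miles) from the tabulation above.
t_1976_1980 = welch_t(29.4, 33.7, N_WHITE, 14.2, 15.9, N_BLACK)
t_1979_1980 = welch_t(30.7, 28.6, N_WHITE, 16.5, 18.8, N_BLACK)
print(f"1976-1980: t = {t_1976_1980:.2f}")
print(f"1979-1980: t = {t_1979_1980:.2f}")
```

The statistic grows with the sample sizes, so the values printed here depend on the assumed counts and should not be read as the figures the court computed.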

There is a related implication: if black applicants live farther from the jobsites than white applicants, then the “exceptions” to the ICG’s desire to minimize distance should favor black applicants, who will be overrepresented in the pool eligible for exceptional treatment. The St. Louis division had 122 non-traveling laborers who lived more than 50 miles from work. All but three were white. The ICG has no explanation for this datum, which was introduced at trial and is inconsistent with its contention that black applicants lived farther from work than white applicants.

It might be possible to explain these findings by showing that the ICG had some rural worksites, with predominantly white workers commuting long distances, and some urban worksites with black laborers commuting short distances; the laborers commuting more than 50 miles also would be concentrated at the rural sites. But the ICG did not suggest that this accurately represents its employment picture, and none of its sites had a labor force more than 30% black, which seems to exclude the possibility we mention. (The largest urban area in the district, St. Louis and East St. Louis, easily could produce a labor force more than 30% black, and the white workers at these sites would not have commuting distances longer than the black workers.)

The ICG’s statistical expert took a different approach altogether. See 639 F.Supp. at 1571-73, 1577-79. He obtained Census data about the labor force in each county within the ICG’s St. Louis division and excluded professional, technical, and sales workers, arriving at a “diminished civilian labor force.” The exclusion was appropriate because the ICG did not hire such people as laborers. The expert calculated the number (and percentage) of black and white employees in each county’s “diminished civilian labor force.” Then he determined the number of people actually hired from each county, by race, and concluded that the number of black laborers the ICG hired was less than two standard deviations away from the number that would be predicted by race-blind hiring from the “diminished civilian labor force.” This approach disregarded the distance of counties from the ICG’s worksites and therefore cannot establish that distance mattered. In an effort to determine the effects of distance, the expert also gave each county a “weight” based on the number of people the ICG hired from there; so, for example, if two counties of equal “diminished civilian labor force” produced 100 and 10 laborers — presumably because one county was farther from the ICG’s worksites than the other — the expert would give the second county a weight of 0.10 relative to the first. Then he applied this weight to compute an adjusted, “diminished civilian labor force” and to predict the number of laborers the ICG should have hired by race and county. Using this method he concluded that the ICG had hired more black laborers than would be expected from race-blind selection.

The district court was not convinced by this evidence, 639 F.Supp. at 1577-79, and we are not either. The unweighted studies have nothing to do with the distance of potential workers (and their counties) from the ICG’s places of work; they call into question the implications of the plaintiffs’ applicant flow studies but do not verify the ICG’s distance defense. The weighted portion of the study, which the ICG believes cures that defect, begs the question. It assumes what is to be proven — that differences in hiring are attributable to distance rather than race. The expert derived the weights from the ICG’s hiring, rather than from distance. So if the ICG hired one person from an all-black county with a labor force of 100 and 10 people from an all-white county with a labor force of 100, the expert would assign the black county a weight of 0.10 and deduce that the ICG was “expected” to hire only one person from it — even if the black county were closer to the ICG’s places of work than the white county. Voila! Discrimination itself produces a statistical finding of no discrimination.
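
The circularity can be made concrete with the two-county hypothetical just given. The sketch below is a simplified reading of the weighting described in the opinion, not the expert's actual procedure; the function name and the data layout are assumptions made for illustration.

```python
def hire_based_prediction(counties):
    """Predict black hires using weights derived from hiring itself.

    Each county is a dict with keys labor_force, black_share, and hires.
    """
    rates = [c["hires"] / c["labor_force"] for c in counties]
    top_rate = max(rates)
    total_hires = sum(c["hires"] for c in counties)
    adjusted_total = 0.0
    adjusted_black = 0.0
    for county, rate in zip(counties, rates):
        # The weight comes from the hiring rate, not from distance.
        weight = rate / top_rate
        adjusted = weight * county["labor_force"]
        adjusted_total += adjusted
        adjusted_black += adjusted * county["black_share"]
    return total_hires * adjusted_black / adjusted_total

# The opinion's hypothetical: one hire from an all-black county and ten
# from an all-white county, each with a labor force of 100.
counties = [
    {"labor_force": 100, "black_share": 1.0, "hires": 1},
    {"labor_force": 100, "black_share": 0.0, "hires": 10},
]
predicted = hire_based_prediction(counties)
print(f"predicted black hires: {predicted:.2f}, actual: 1")
```

Because the adjusted labor force in each county is proportional to the hires made there, the prediction simply restates the hiring pattern it is supposed to test.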

We have used an extreme hypothetical, but the problem is real. Suppose there are two counties, equal in size and distance to the ICG’s worksites. If one county’s labor force is 30% black and the other county’s labor force 10% black, race-blind hiring from these two counties as a unit should produce a labor force 20% black (if people apply for work in proportion to their numbers). Suppose, however, the ICG hires only from the second county (with 90% white laborers). The first county will receive a weight of zero, and if the ICG’s hires are 10% black the study will conclude that there has been no discrimination. This is an artifact of the way in which the hypothesis was formulated. We need not go into the details of statistical significance to see that the results are unreliable. The problem is more than hypothetical. The plaintiffs offered evidence that some counties, with relatively large black labor forces, had been skipped over while the ICG hired from other, “whiter” counties farther from its workplaces. Four counties in the St. Louis division have “diminished civilian labor forces” more than 20% black. The ICG hired significant numbers of employees from only one of these counties, while hiring larger numbers from adjacent counties.
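
The two-county example can be worked through numerically. This is a minimal sketch: the labor force sizes and the number of hires are hypothetical, and the two function names are assumptions. It compares the benchmark that race-blind hiring from both counties would produce with the benchmark that results when counties are weighted by the employer's own hiring.

```python
def pooled_black_share(counties):
    """Black share of the combined labor force of all counties."""
    total = sum(c["labor_force"] for c in counties)
    black = sum(c["labor_force"] * c["black_share"] for c in counties)
    return black / total

def hire_weighted_black_share(counties):
    """Black share expected when each county counts only in
    proportion to the hires actually made there."""
    hires = sum(c["hires"] for c in counties)
    black = sum(c["hires"] * c["black_share"] for c in counties)
    return black / hires

# Two counties of equal size and equal distance from the worksites.
# Labor force sizes and hire counts are HYPOTHETICAL.
counties = [
    {"labor_force": 1000, "black_share": 0.30, "hires": 0},
    {"labor_force": 1000, "black_share": 0.10, "hires": 50},
]
print(f"race-blind benchmark:    {pooled_black_share(counties):.2f}")
print(f"hire-weighted benchmark: {hire_weighted_black_share(counties):.2f}")
```

A work force that is 10% black matches the hire-weighted benchmark exactly while falling well short of the 20% that race-blind hiring from the two counties as a unit would produce.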

So the defendants’ statistical work simply does not show that distance accounts for the racial differences in the rate of hiring. The district court found that the ICG used distance as a ground of decision and apparently believed that this finding carried the day for the railroad on the disparate treatment claim. But it does not; there must be a causal link between criterion and consequence. Only that causal link would compel the plaintiff class to demonstrate (in the disparate treatment portion of the litigation) that the railroad was using distance to get at race (because of, not just in spite of, its disparate consequence). The only piece of evidence providing the causal link was the statistical study, which the district court set aside (correctly, we have held). That left the ICG’s burden of production unsatisfied.

This is so even though we have considered only the plaintiffs’ disparate treatment theory. Under Texas Department of Community Affairs v. Burdine, 450 U.S. 248, 101 S.Ct. 1089, 67 L.Ed.2d 207 (1981), once the plaintiff establishes a prima facie case the employer need only “articulate” rather than “prove” a non-discriminatory explanation for its conduct. The plaintiff then must show that the employer intentionally used race as a ground of decision, and that this caused injury. Burdine, 450 U.S. at 254-56, 101 S.Ct. at 1094-95; Pollard v. Rea Magnet Wire Co., 824 F.2d 557 (7th Cir.1987). We may assume that this allocation of burdens applies even in pattern-or-practice litigation, although it has much in common with disparate impact litigation, as Teamsters and Furnco show. The pattern-or-practice case starts with a stronger showing than the individual disparate treatment case; the “prima facie case” under McDonnell Douglas Corp. v. Green, 411 U.S. 792, 93 S.Ct. 1817, 36 L.Ed.2d 668 (1973), supports only a weak inference of discrimination, while the statistical showing in a pattern-or-practice case leaves a smaller possibility of race-neutral conduct. This suggests that Burdine’s allocation of burdens may be inappropriate in pattern-or-practice litigation. Even if that allocation applies, as we assume today, what the employer must “articulate” is an explanation of its activity. Burdine, 450 U.S. at 254-55, 101 S.Ct. at 1094-95.

When the defendant pronounces a reason unrelated to the plaintiff (“a midget can’t do the job”, followed by silence on the plaintiff's height), it has not adequately articulated a neutral reason within the meaning of Burdine. The employer’s burden of production means that it must introduce facts sufficient in principle to explain what happened. The ICG did not need to “prove” the validity of its explanation, but it did need to give one. In a pattern-or-practice case like this the employer need not show that distance accounted for the decision concerning any particular applicant, or that distance was important to rail operations (“business necessity”); it had at least to produce evidence that black and white applicants lived different distances from work.

Just in case the evidence was sufficient to satisfy the burden of production, we add that the explanation was pretextual. The definition of a pretext, which we discussed in Pollard, is a statement that does not describe the actual reasons for the decision. The employer need not have “good” reasons, and a mistaken business decision is not on that account a “pretext”. See also, e.g., Friedel v. City of Madison, 832 F.2d 965, 973-76 (7th Cir.1987). But an explanation that does not convey the motivation for the decision is a pretext, and if the hypothesis of racial motive best explains the actions, then also a “pretext for discrimination”. Cf. Benzies v. Illinois Department of Mental Health, 810 F.2d 146, 148 (7th Cir.1987). It is not likely that the ICG thought that blacks’ greater distance from work explained anything. The data so explode the ICG’s explanation that they also dissipate any possibility of good-faith mistake (i.e., the possibility that some explanation other than either race or distance explains the pattern). The ICG’s explanation destroyed, the plaintiffs may prevail on their original case, especially one of this strength. Cf. Graefenhain v. Pabst Brewing Co., 827 F.2d 13, 17-18 (7th Cir.1987) (same conclusion under the ADEA, even on a weaker case). The only explanations that have been offered for the disparity (and the only ones we can imagine in the absence of evidence that the black applicants had some systematic difference in qualifications from the white applicants) are discrimination and distance. To debunk the explanation based on distance is to compel the acceptance of the explanation from discrimination.

There remains only one possibility: that the ICG’s studies, although failures as attempts to relate hiring to distance, undercut the plaintiffs’ prima facie case of discrimination. They do not do so, however. Studies based on Census labor pool data rarely overcome studies based on actual applicants. Each data base produces some biases. When used to study unskilled jobs, each set of biases will favor the employer. As a result, an inference of discrimination from either is strong, and failure to replicate the inference using other data does not support a finding of no discrimination.

Statistical analysis of the actual applicants has the advantage of self-selection: the study examines how the employer actually treated the people who wanted the job. Applicant studies are preferable as a rule because Title VII governs the treatment of applicants. What better statistical base than the people to whom Title VII applies? See Movement for Opportunity and Equality v. General Motors Corp., 622 F.2d 1235, 1245 (7th Cir.1980). Cf. New York Transit Authority v. Beazer, 440 U.S. 568, 582-87, 99 S.Ct. 1355, 1364-66, 59 L.Ed.2d 587 (1979). See also, e.g., David C. Baldus & James W.L. Cole, Statistical Proof of Discrimination § 4.11 (1980). Self-selection introduces biases into the pool, however. One obvious effect is that if employers recruit marginally qualified people in an effort to make the job more readily available to minority workers, the difference in average levels of skill also may explain a lower rate of hiring the minority applicants. Baldus & Cole at § 4.124. The ICG does not say that it recruited black applicants. There are other self-selection effects, however. To see this, consider how people decide to apply for the job.

People apply because they want the job more than their current employment (or prefer it to unemployment). Even those who want the job will not apply, however, unless they think they have some chance of being hired. They must travel to Carbondale, which may be a long and costly trip (costly not only in money but also in time, which includes income from other work foregone). To determine whether a study of applicants will identify an employer’s rule of decision (taking account of race and taking account of distance are rules of decision), we must first know how the rule of decision influences who applies.

Take the first possibility, that the firm is using race as a rule of decision. If this is known (either to applicants or to those ubiquitous informational intermediaries, employment agencies, which act as applicants’ proxies) it will deter blacks from applying. Cf. Dothard v. Rawlinson, 433 U.S. 321, 330, 97 S.Ct. 2720, 2727, 53 L.Ed.2d 786 (1977) (sex discrimination). The firm could hire as large a percentage of the (few) black applicants as of white applicants while still discriminating. Discrimination not only reduces the number of black applicants but also increases their average level of skills. Only a (relatively) superior black worker would find the costs of applying worthwhile. The observation that the firm hires black and white applicants in such a pool with equal probability would show only that it is happy to hire high-quality blacks while it is satisfied to use medium-quality whites, people who could not get in the door if they were black. To put this differently, discrimination affects the applicant pool in a way that makes the discrimination harder to detect. The discriminating employer induces qualified blacks not to apply, and these non-applicants, victims of discrimination as much as the non-hired applicants, will make the employer’s hiring look “better” than it is. An applicant pool analysis is biased against finding discrimination, if potential applicants know or suspect that the employer is discriminating. If a study based on applicants nonetheless implies discrimination, this is potent evidence.

An employer’s decision to take distance into account also affects who applies. If the employer announces that it will hire only people who live within 50 miles of work, then only people with this characteristic will apply. It will make no difference whether blacks in the St. Louis district live farther from the ICG’s yards than whites; the distant blacks will select themselves out. If white workers predominate in the vicinity of the ICG’s worksites, and people know of the ICG’s preferences, most applicants will be white. In equilibrium people will self-select so that each applicant has an equal chance of being hired, which will mean, if the ICG is race-blind, that the average distance from work of black and white applicants is identical. Whether or not black people in the district live farther from the ICG, black applicants should be situated similarly to white applicants. This, too, produces a bias in the ICG’s favor. If what the ICG says about its hiring policy is true, hiring rates of actual applicants by race should be similar even if the average distances of potential applicants are dissimilar. When the study shows a substantial disparity in hiring rates even among applicants, it has made a strong case.

Now consider the potential bias in a study based on population rather than applicants. The labor force in any county contains many workers satisfied with their jobs, perhaps because they are already earning more than the ICG offers to unskilled labor. Because black workers in any given labor force earn lower wages than their white counterparts, 1987 Statistical Abstract of the United States Table 680, and are unemployed more frequently, id. at Tables 664, 667, they are more likely to be available for unskilled work than are white workers in the same labor force. Knowledge that 10% of the workers in the “diminished civilian labor force” of a given county are black does not imply that only 10% of the laborers realistically available to the ICG are black. The ICG’s offer will beat the status quo more frequently for black laborers than for white laborers. If 10% of the labor force in a given market is black, 20% or more of the laborers available to the ICG may be black. A study disclosing that the employer’s hires match the color distribution of the “diminished civilian labor force” (in this hypothetical, that 10% of the laborers are black) does not permit a confident inference that the employer is not discriminating.

The record in this case contains both applicant and labor force studies. Each kind of study, for unskilled labor, is biased against finding discrimination. When the applicant pool study produces a powerful inference of discrimination, and the labor force study produces at best a weak inference of race-neutral action, the plaintiffs have done what they must.

Another, more general, effect is at work, and it cuts in the other direction. Either kind of study can overstate discrimination because of the way statistical methods deal with unknown variables. Much statistical work — including that in this case — implicitly attributes “unexplained” variance to race. Yet labor markets are poorly understood; to chalk up all unexplained variation to race is to find more discrimination than there is. There is also a selection bias in selecting cases to be litigated. A statistician calls a result “significant” if there is less than one chance in 20 that it is consistent with the null hypothesis (here, race-blind hiring). In a world with 1000 employers, 50 will meet this criterion of significance by chance. These will become targets of litigation, and courts are apt to find discrimination in all 50 cases even though by hypothesis none of the 1000 employers is discriminating. Because many employers have multiple divisions, and each job for each year in each division may be the subject of a separate study, each employer is bound to have some job categories in which statistical techniques will show chance variations that permit the inference of discrimination. See David W. Barnes, The Problem of Multiple Components or Divisions in Title VII Litigation: A Comment, 46 L. & Contemp.Prob. 185 (Aut. 1983).
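The arithmetic behind the selection-bias point can be sketched as follows. This is an illustration only: the function names are hypothetical, the 5% threshold is the "one chance in 20" criterion mentioned above, and the second function assumes the separate studies are statistically independent, which real divisions and job categories may not be.

```python
def expected_chance_findings(n_employers: int, alpha: float = 0.05) -> float:
    """Expected number of race-blind employers whose data look
    'significant' purely by chance at significance level alpha."""
    return n_employers * alpha


def prob_some_chance_finding(n_studies: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n_studies independent studies
    of a race-blind employer crosses the significance threshold.
    Independence between studies is an assumption of this sketch."""
    return 1.0 - (1.0 - alpha) ** n_studies


if __name__ == "__main__":
    # 1000 race-blind employers, one study each: about 50 chance findings.
    print(expected_chance_findings(1000))
    # One race-blind employer studied separately in 8 divisions:
    # roughly a one-in-three chance that some division looks "significant".
    print(prob_some_chance_finding(8))
```

The second figure grows quickly as the number of separate studies per employer rises, which is the concern raised about employers with multiple divisions, jobs, and years.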

These troubling effects of statistical inferences require thoughtful consideration in each case. Courts ought not demand that employers do the impossible and explain behavior that baffles the best labor economists. And it would be most unpleasant if statistical techniques were simply being used to find the inevitable imbalances in a population of non-discriminating firms. Parties should address explicitly how to handle unexplained variance. There are several potentially useful methods. A court may use the EEOC’s “80% rule” (which Professors Meier, Sacks & Zabell explain and endorse, and which was satisfied here). Perhaps a court should say that only statistical models explaining a substantial portion of the variance may be used to infer discrimination (which Professor Campbell believes the best approach). See also Richard Goldstein, Two Types of Statistical Errors in Employment Discrimination Cases, 26 Jurimetrics 32 (1985). The disparities in this case are so striking, however, and the ICG’s efforts to explain them so unsuccessful, that the tendency of statistical devices in general to assign as “discrimination” too much of what is simply puzzling does not assist the ICG. The prima facie case of discrimination stands, and as we have concluded that the ICG’s studies do not satisfy the employer’s need to articulate a reason unrelated to race, the class must prevail on the merits.
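As a rough illustration of the “80% rule” mentioned above, the sketch below compares one group's hiring rate with another's and asks whether the ratio falls under four-fifths. The function names are hypothetical, and the computation is a simplified reading of the rule, not the EEOC's full procedure. The rates used are the Carbondale figures stated at the outset of this opinion (1979: 11% of black and 39% of white applicants hired; 1980: none and 6.1%).

```python
def impact_ratio(rate_group: float, rate_reference: float) -> float:
    """Ratio of one group's selection rate to the reference group's rate."""
    if rate_reference <= 0:
        raise ValueError("reference selection rate must be positive")
    return rate_group / rate_reference


def below_four_fifths(rate_group: float, rate_reference: float,
                      threshold: float = 0.8) -> bool:
    """True when the ratio is under the four-fifths threshold, the
    conventional signal of adverse impact under this simplified reading."""
    return impact_ratio(rate_group, rate_reference) < threshold


if __name__ == "__main__":
    # Hiring rates for laborer applicants as stated in the opinion.
    for year, black_rate, white_rate in [(1979, 0.11, 0.39), (1980, 0.0, 0.061)]:
        ratio = impact_ratio(black_rate, white_rate)
        print(year, round(ratio, 2), below_four_fifths(black_rate, white_rate))
```

In both years the ratio is far below 0.8, so the comparison does not turn on the exact threshold chosen.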

III

Robert Earl Mister, the class representative, pressed a disparate treatment claim on his own behalf. The district court resolved this adversely to Mister. 639 F.Supp. at 1581-82. This resolution is not clearly erroneous. Mister lied on his application about his employment history, falsely denying that he had been fired from an earlier job; it is undisputed that the ICG would not have hired an applicant of any race who lied in this fashion (though it is open to question whether the ICG would have checked Mister’s history before hiring him). Moreover, after Mister filed his charge of discrimination with the EEOC, the ICG twice offered him employment, and he twice refused. The district court was entitled to conclude that Mister was not personally the victim of disparate treatment.

The judgment is affirmed to the extent it concerns Mister’s personal claim. The remainder of the judgment is reversed, and the case is remanded for the award of appropriate relief. The class shall recover its costs. Circuit Rule 36 shall not apply on remand. 
      
      . The distribution is not “normal”, so the t-test may not be the best, but its values are so large (and there are so many degrees of freedom) that we have confidence that any different method of analysis would not come up with a different answer.
     
      
      . One caveat. Societal discrimination against black workers would reduce the opportunity cost of their time. That is, black applicants on average would be giving up less income from other employment to apply for a new job, and therefore would travel farther and accept lower probabilities of being hired. The railroad then would hire a smaller portion of black applicants without engaging in discrimination. This undercuts, to an unknowable degree, both self-selection effects discussed in the text. It is impossible to believe — and the ICG does not contend — that this counterweight is large enough to account for disparate effects of the magnitude the plaintiffs established.
     
      
      . The bias comes from the supposition that the employer is hiring unskilled labor. It can easily be reversed if the employer is hiring skilled labor. That 10% of the workers in the labor force are black does not imply that 10% of the workers qualified to be aeronautical engineers or professors of Greek are black. Proof that an employer of engineers hired in proportion to the race of the labor force as a whole would acquit it of discriminating against blacks. We mention this to reinforce the point that it is necessary to review the inferential process case by case; the biases of one study in one case may be avoided or reversed in the next.
     