
    UNITED STATES of America, Plaintiff, v. CITY OF YONKERS; Gerald Loehr as Mayor of the City of Yonkers; Charles Connolly as Commissioner of Police of the City of Yonkers; City of White Plains; Alfred Del Vecchio, as Mayor of the City of White Plains; John Dolce as Commissioner of Public Safety of the City of White Plains; New York State Department of Civil Service; Victor S. Bahou, as President and Commissioner of the New York State Department of Civil Service; Josephine J. Gambino and James T. McFarland, as Commissioners of the New York State Department of Civil Service, Defendants.
    No. 80 Civ. 7407 (ADS).
    United States District Court, S.D. New York.
    Sept. 17, 1984.
    
      Rudolph W. Giuliani, U.S. Atty. for S.D. of N.Y., New York City, for plaintiff; Jane E. Bloom, Kathleen R. Roberts, Jane E. Booth, New York City, of counsel.
    Epstein Becker Borsody & Green, P.C., New York City, and Arthur J. Doran, Jr., Corp. Counsel, Yonkers, N.Y., for Yonkers defendants; Susan Schenkel-Savitt, Richard L. Steer, New York City, of counsel.
    Robert Abrams, Atty. Gen. of the State of N.Y., New York City, for State defendants; Melvyn R. Leventhal, Stanley A. Camhi, Martha O. Shoemaker, New York City, of counsel.
   OPINION AND ORDER

SOFAER, District Judge:

The United States commenced this action against the City of Yonkers, its Mayor and Police Commissioner, the New York State Department of Civil Service, and its President and Commissioners in December 1980. The suit challenged hiring procedures for the Yonkers Police Department, including written examinations administered in 1972, 1973, and 1977, physical agility tests administered in 1973 and 1977, and a height requirement in effect until 1973. The complaint alleged that defendants had engaged in a pattern and practice of discrimination on the basis of race and gender in violation of Title VII of the Civil Rights Act of 1964, as amended, 42 U.S.C. § 2000e-2000e-17; the State and Local Fiscal Assistance Act of 1972, 31 U.S.C. § 6716; the Comprehensive Employment and Training Act of 1973, Pub.L. No. 93-203, § 612, 87 Stat. 839, 882 (repealed 1982); and the Omnibus Crime Control and Safe Streets Act of 1968, 42 U.S.C. § 3789d.

In an earlier opinion, this court denied motions for summary judgment filed by both the Yonkers and State defendants. See United States of America v. City of Yonkers, 592 F.Supp. 570 (S.D.N.Y.1984). We declined at that time to decide defendants’ Rule 41(b) motions to dismiss for failure to establish a prima facie case, in order to afford defendants an opportunity further to cross-examine the government’s expert witness, Dr. Bernard R. Siskin. Defendants having decided not to cross-examine Dr. Siskin further at this time, this memorandum and order now assesses the sufficiency of the United States’ prima facie case.

In support of their motions, defendants make several preliminary legal arguments. The defendants first contend that disparate impact analysis is only appropriate to challenges to “specific, limited, facially neutral employment criteria,” whereas here the United States mounts an “overall challenge” to an allegedly “pervasive pattern or practice of discrimination.” Yonkers Defendants’ Pretrial Brief at 17-18 (April 23, 1984). At trial the United States adduced sparse anecdotal evidence of discriminatory attitudes toward and treatment of minority and female applicants, elicited testimony on patterns of assignment and promotion, attempted to establish the inadequacy of Yonkers’ recruitment efforts, and tried to bring out a perception in the minority community of alienation from the Yonkers police force. The United States now firmly represents, however, that it founds this action wholly on the disparate impact model of employment discrimination and attacks only certain facially neutral selection criteria in the hiring of Yonkers police officers. United States’ Brief in Support of its Prima Facie Case at 1-2, 10 (May 18, 1984); see International Brotherhood of Teamsters v. United States, 431 U.S. 324, 335 n. 15, 97 S.Ct. 1843, 1854 n. 15, 52 L.Ed.2d 396 (1977); B. Schlei & P. Grossman, Employment Discrimination Law 80-205 (2d ed. 1983). In any event, for the most part the evidence of particular instances of discriminatory conduct was unreliable and altogether insufficient. 
The United States explains that it offered other evidence — such as the refusal of Yonkers officials to allow one woman to retake the physical fitness test when a dispute arose over whether she had run the required distance in one event, and their refusal to allow another woman to take the medical examination when she claimed she had not received notice — only to demonstrate that Yonkers “rigidly applied rules” in a way that precluded affirmative action and that “Yonkers treated minorities and women exactly like white men.” Reply Brief in Support of its Prima Facie Case at 6 n.*. But Yonkers had no obligation to run its testing procedures on a discriminatory basis.

The United States also failed to establish that Yonkers violated Title VII by failing adequately to recruit minorities and women. See, e.g., Association Against Discrimination in Employment, Inc. v. City of Bridgeport, 479 F.Supp. 101, 110 (D.Conn.1979), aff'd in part and vacated in part on other grounds, 647 F.2d 256 (2d Cir.1981), cert. denied, 455 U.S. 988, 102 S.Ct. 1611, 71 L.Ed.2d 847 (1982). Assuming that Yonkers had a reputation for discrimination and hence an obligation to recruit, the evidence concerning Yonkers’ recruitment efforts suggested that Yonkers engaged in increasingly vigorous and effective campaigns before each test. While Yonkers Personnel Director Jack Christiansen insisted that the City had conducted some minority recruiting in 1972, no other witness could recall such an effort. In 1973, however, Police Chief Polsen issued a press release in English and Spanish announcing the date of the examination and expressing Yonkers’ interest in attracting candidates from all groups in the community in order to remedy minority underrepresentation in the police department. More importantly, the Yonkers Urban League, the Administrator of the Human Rights Commission, and Rita Nelson, a Yonkers police officer, organized a recruitment and training program. Nelson testified that she and other Yonkers police officers participated in this campaign and served as instructors for a series of eight, two-hour training sessions; that several ranking officers contributed funds from their own pockets; that the police department contributed the use of its facilities; and that a Spanish-speaking officer was dispatched to speak with Hispanic community groups. A local newspaper and Christiansen testified that Yonkers sought public service spots on local radio. The City prepared and distributed recruitment posters and pamphlets; it also set up a recruitment center in a minority neighborhood and detailed minority police officers to do recruiting. 
In 1983 Yonkers hired James Barrier, a retired black police lieutenant, to head the 1983 effort. In its campaign evaluation following the 1977 test, Yonkers had observed that, while it had succeeded in attracting significant numbers of women and minorities to register for the exam, members of these groups proved much less likely actually to appear for the test. Yonkers attempted to remedy this problem for the 1983 test by arranging to transport candidates to the testing sites. In addition, it again opened a recruitment center, distributed literature, and detached minority police officers for full-time recruiting. Yonkers also conducted an extensive training program, with 32 teachers leading classes of 25 persons apiece. All witnesses agreed that the 1983 recruitment effort had been highly successful. Numerical results, which showed that the percentage of women and minorities that had signed up for and taken the examination substantially exceeded their representation in the relevant labor force, bore out this assessment.

The 1973 and 1977 recruitment campaigns were by no means exemplary. The 1977 training program, for example, was criticized for comparing poorly with the test actually administered. And Hispanic community leaders continued to criticize outreach efforts to their community. Overall, however, the record reveals gradually improving and ultimately successful efforts to eliminate any prevailing perceptions in the minority communities that the Yonkers police force was not accessible to them. See Transcript at 922.

Defendants argue further that the evidence thus far presented is insufficient to establish a prima facie case of disparate impact of the selection procedures on blacks, Hispanics, and women. But the United States identifies three specific employment criteria — the 1972, 1973, and 1977 written examinations, the 1973 and 1977 physical fitness tests, and the height requirement — which on their face might well be “facially neutral in their treatment of different groups but ... in fact fall more harshly on one group than another.” Teamsters, 431 U.S. at 335 n. 15, 97 S.Ct. at 1854 n. 15. A fourth criterion — the 1972 exclusion of women — is discriminatory on its face. Whatever the utility of disparate impact analysis in evaluating subjective employment criteria or in mounting “a wide ranging attack on the cumulative effect of a [defendant’s] employment practices,” Pouncy v. Prudential Insurance Co., 668 F.2d 795, 800 (5th Cir.1982); compare id. with Schlei & Grossman at 1288 & n. 15, this suit, as presently pressed, is an appropriately circumscribed challenge to specific, objective selection criteria. Compare Carroll v. Sears, Roebuck & Co., 708 F.2d 183, 188-89 (5th Cir.1983) (Wisdom, J.) (employment practices related to training, promotion, termination, and job classification susceptible only to disparate treatment analysis), with id. at 189 (facially neutral testing program subject to disparate impact analysis).

Defendants also contend that as a matter of law disparate impact analysis cannot establish a pattern or practice of discrimination within the meaning of section 707(a). The argument confuses the breadth of discrimination necessary to support the Attorney General’s suit with the means available to prove it. The Supreme Court has advised that in a pattern-or-practice action the government “ultimately had to prove more than the mere occurrence of isolated or ‘accidental’ or sporadic discriminatory acts,” Teamsters, 431 U.S. at 336, 97 S.Ct. at 1855, “ ‘single, insignificant, isolated acts of discrimination’ ” would not do, id. at 336 n. 16, 97 S.Ct. at 1855 n. 16 (quoting Senator Humphrey, 110 Cong.Rec. 14270 (1964)). But as the Court’s comparison in Teamsters of the disparate treatment and disparate impact analyses in the context of a section 707(a) suit suggests, see id. at 335 n. 15, 97 S.Ct. at 1854 n. 15, a pattern of discrimination might be proved by either widespread disparate impact or regular disparate treatment. “Either theory may, of course, be applied to a particular set of facts.” Id. For this reason, courts have employed disparate impact analysis in pattern-or-practice suits without comment. See, e.g., United States v. City of Chicago, 573 F.2d 416, 424 (7th Cir.1978) (suit by Attorney General under 42 U.S.C. 2000e et seq.); Walls v. Mississippi State Department of Public Welfare, 542 F.Supp. 281, 309-10 (N.D.Miss.1982) (government challenge under section 707 to educational qualifications and testing program utilizing disparate impact analysis), aff'd in part, rev’d in part on other grounds, 730 F.2d 306 (5th Cir.1984); United States v. City of Buffalo, 457 F.Supp. 612, 621-22 (N.D.N.Y.1978), modified on other grounds, 633 F.2d 643 (2d Cir.1980). Moreover, because by definition disparate impact analysis must focus on much more than isolated occurrences, it lends itself to uncovering a proscribed pattern or practice. 
As a practical matter, a well-documented statistical assault might well reveal a pattern of the identified effect. See, e.g., United States v. City of Chicago, 385 F.Supp. 543, 553-54 (N.D.Ill.1974).

Pointing next to the language of section 707(a) which requires that the Attorney General have reason to believe that the identified pattern or practice “is of such a nature and is intended to deny the full exercise” of statutorily protected rights, defendants argue that in order to establish a pattern or practice within the meaning of the statute the United States must demonstrate intention in the constitutional sense. See, e.g., Washington v. Davis, 426 U.S. 229, 238-39, 96 S.Ct. 2040, 2046-47, 48 L.Ed.2d 597 (1976). The courts have uniformly refused to read such a requirement into the statute, however, but have instead respected the evident congressional purpose that the statute reach conduct which is intentional in the relatively innocuous sense of advertence. See Local 189, United Papermakers and Paperworkers v. United States, 416 F.2d 980, 996-97 (5th Cir.1969) (intent requisite to § 707(a) suit “may be inferred from the fact that defendants persisted in the conduct after its racial implications had become known to them”), cert. denied, 397 U.S. 919, 90 S.Ct. 926, 25 L.Ed.2d 100 (1970); Walls v. Mississippi State Department of Public Welfare, 542 F.Supp. 281, 309 n. 17 (N.D.Miss.1982), aff'd in part, rev’d in part on other grounds, 730 F.2d 306 (5th Cir.1984); United States v. Local 357 of the International Brotherhood of Electrical Workers, 356 F.Supp. 104, 117 (D.Nev.1973); see also United States v. Bethlehem Steel Corp., 312 F.Supp. 977, 993 (W.D.N.Y.1970), modified, 446 F.2d 652 (2d Cir.1971); see also Griggs v. Duke Power Co., 401 U.S. 424, 432, 91 S.Ct. 849, 854, 28 L.Ed.2d 158 (1971) (“Congress directed the thrust of the Act to the consequences of employment practices, not simply the motivation” (emphasis in original)).

The proof introduced thus far at trial establishes a prima facie case of discrimination. As the Supreme Court has explained:

to establish a prima facie case of discrimination, a plaintiff need only show that the facially neutral standards in question select applicants for hire in a significantly discriminatory pattern. Once it is thus shown that the employment standards are discriminatory in effect, the employer must meet “the burden of showing that any given requirement [has] ... a manifest relationship to the employment in question.” Griggs v. Duke Power Co., [401 U.S. 424, 432, 91 S.Ct. 849, 854, 28 L.Ed.2d 158 (1971) ]. If the employer proves that the challenged requirements are job related, the plaintiff may then show that other selection devices without a similar discriminatory effect would also “serve the employer’s legitimate interest in ‘efficient and trustworthy workmanship.’ ” Albemarle Paper Co. v. Moody, [422 U.S. 405, 425, 95 S.Ct. 2362, 2375, 45 L.Ed.2d 280 (1975) (quoting McDonnell Douglas Corp. v. Green, 411 U.S. 792, 801, 93 S.Ct. 1817, 1823, 36 L.Ed.2d 668 (1973) ].

Dothard v. Rawlinson, 433 U.S. 321, 329, 97 S.Ct. 2720, 2726, 53 L.Ed.2d 786 (1977); see, e.g., Connecticut v. Teal, 457 U.S. 440, 446-47, 102 S.Ct. 2525, 2530-31, 73 L.Ed.2d 130 (1982). At this point, the United States need only demonstrate a “significantly discriminatory” effect from the challenged procedures in order to put the defendants to their rebuttal. Statistical evidence is wholly competent for the purpose. See, e.g., Teamsters, 431 U.S. at 339-40 & n. 20, 97 S.Ct. at 1856-57 & n. 20; Bushey v. New York State Civil Service Commission, 733 F.2d 220, 225 (2d Cir.1984); Kirkland v. New York State Department of Correctional Services, 711 F.2d 1117, 1131-32 (2d Cir.1983), cert. denied, — U.S. —, 104 S.Ct. 997, 79 L.Ed.2d 230 (1984); United States v. Wood, Wire and Metal Lathers International Union, Local No. 46, 471 F.2d 408, 414 n. 11 (2d Cir.1973); United States v. Ironworkers Local 86, 443 F.2d 544, 551 (9th Cir.), cert. denied, 404 U.S. 984, 92 S.Ct. 447, 30 L.Ed.2d 367 (1971); Schlei & Grossman, supra, at 1287 (“Statistics are almost totally determinative in adverse impact cases”). The findings as to Title VII apply equally to the Revenue Sharing Act, 31 U.S.C. § 6716(a), see Cohen v. West Haven Board of Police Commissioners, 638 F.2d 496, 499 & n. 4 (2d Cir.1980); United States v. State of New York, 82 F.R.D. 2, 4-5 (N.D.N.Y.1978); Officers for Justice v. Civil Service Commission, 20 Fair Empl.Prac.Cas. (BNA) 179, 185 (N.D.Cal.1978); and to the Omnibus Crime Control and Safe Streets Act, 42 U.S.C. § 3789d(c)(1), see United States v. New York, 82 F.R.D. 2, 4-5; Officers for Justice, 20 Fair Empl.Prac.Cas. (BNA) at 185. Although no court has applied Title VII standards to the Comprehensive Employment and Training Act, the provisions of that statute are similar to those of Title VII, see Comprehensive Employment and Training Act, Pub.L. No. 93-203, § 612(a), 87 Stat. 839, 882 (1973) (repealed 1982).

The United States relies heavily on the report of Dr. Bernard R. Siskin, first presented as an appendix to the United States’ Pretrial Brief but revised several times both during the prima facie phase of trial and well after its close. See, e.g., Siskin Report (April 1984) (appendix to United States Pretrial Brief); Siskin Report (amended April 29, 1984) (Gov’t Exhibit 26E); Siskin Report (July 5, 1984). In summary, Dr. Siskin testified that the 1972, 1973, and 1977 written tests had a statistically significant disproportionate impact on black and Hispanic test-takers, that the 1973 and 1977 physical agility tests had a statistically significant disproportionate impact on female test-takers, and that the height requirement had a statistically significant disproportionate impact on female applicants.

The Uniform Guidelines on Employee Selection of the Equal Employment Opportunity Commission prescribe a “four-fifths rule” by which federal agencies charged with the enforcement of equal opportunity law identify significant adverse impact of selection procedures:

A selection rate for any race, sex, or ethnic group which is less than four-fifths (4/5) (or eighty percent) of the rate for the group with the highest rate will generally be regarded by the Federal enforcement agencies as evidence of adverse impact, while a greater than four-fifths rate will generally not be regarded by Federal enforcement agencies as evidence of adverse impact.

See 29 C.F.R. § 1607.4(D); see also 28 C.F.R. § 50.14; 5 C.F.R. § 300.103(c); 41 C.F.R. § 60-3.1. Though the guidelines are not binding, the courts have accorded them “great deference.” Griggs, 401 U.S. at 433-34, 91 S.Ct. at 854-55; see Bushey, 733 F.2d at 225. Alternatively, the Supreme Court has approved the use of standard deviation analysis to assess the significance of disproportionate impact. See Casteneda v. Partida, 430 U.S. 482, 496 n. 17, 97 S.Ct. 1272, 1281 n. 17, 51 L.Ed.2d 498 (1977); Hazelwood School District v. United States, 433 U.S. 299, 308 n. 14, 97 S.Ct. 2736, 2742 n. 14, 53 L.Ed.2d 768 (1977). “ ‘[I]f the difference between the expected value [from a random selection] and the observed number is greater than two or three standard deviations,’ a prima facie case is established since the deviation then could only be regarded as caused by some factor other than chance.” Kirkland v. New York State Department of Correctional Services, 711 F.2d 1117, 1131 (2d Cir.1983) (quoting Casteneda, 430 U.S. at 496 n. 17, 97 S.Ct. at 1281 n. 17), cert. denied, — U.S. —, 104 S.Ct. 997, 79 L.Ed.2d 230 (1984).
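
The two measures just described reduce to short calculations. What follows is a minimal sketch in Python under stated assumptions, not a reproduction of Dr. Siskin's method, which this opinion does not set out. The function names and every count are hypothetical and chosen only for illustration. The standard deviation computation assumes the simple binomial model, in which the expected number is n times p and the standard deviation is the square root of n times p times (1 - p).

```python
import math


def selection_rate(passed, took):
    """Fraction of a group's test-takers who passed."""
    return passed / took


def impact_ratio(group_passed, group_took, ref_passed, ref_took):
    """Selection rate of a group divided by the rate of the reference
    (highest-rate) group, as in the four-fifths rule."""
    return selection_rate(group_passed, group_took) / selection_rate(ref_passed, ref_took)


def below_four_fifths(group_passed, group_took, ref_passed, ref_took):
    """True when the ratio falls under the eighty-percent threshold."""
    return impact_ratio(group_passed, group_took, ref_passed, ref_took) < 0.8


def standard_deviations(observed, n, p):
    """Distance between expected and observed counts, in standard
    deviations, under an assumed binomial model of random selection."""
    expected = n * p
    sd = math.sqrt(n * p * (1 - p))
    return (expected - observed) / sd


if __name__ == "__main__":
    # Hypothetical counts: 29 of 100 in one group pass, 50 of 100 in the other.
    print(round(impact_ratio(29, 100, 50, 100), 2))
    print(below_four_fifths(29, 100, 50, 100))
    # Hypothetical: 100 selections, 50% expected share, 40 observed.
    print(standard_deviations(40, 100, 0.5))
```

The two measures answer different questions: the ratio describes the size of a disparity without regard to how many people were tested, while the standard deviation figure also reflects the size of the sample.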

Evaluated under either of these standards, Dr. Siskin’s statistical report demonstrates that the 1977 written examination had a sufficiently significant disproportionate impact upon black and Hispanic applicants to establish a prima facie case. His report indicates that blacks passed the 1977 test at only 58% the rate of whites, and Hispanics at only 57% that rate. See Tables 201, 401. Studies done by the State and by Yonkers confirm Dr. Siskin’s results as to the 1977 test, at least in broad outline. See Gov’t Exhibits 576, 216. In addition, under either means of evaluation, Dr. Siskin’s statistics show a significant disparate impact upon black applicants in 1973, when blacks passed at only one-quarter the rate of whites. See Table 201. However, while the pass rates for blacks in 1972 and Hispanics in 1972 and 1973 are below eighty percent those of whites, the small samples render the disproportion statistically insignificant in units of standard deviation: for blacks in 1972, 1.19 units; for Hispanics in the same year, 1.88; and for Hispanics in 1973, 1.28. See Tables 201, 401; Transcript 270. The results are roughly comparable using an effective pass rate, which counts as passing only those who received the minimum score necessary actually to be selected, rather than simply placed on an eligible list. See Table 901.

The United States argues that the similarity of the 1972 and 1973 tests makes it appropriate to aggregate the samples for those years, in which event the units of standard deviation for black test-takers rise slightly above three and for Hispanic test-takers slightly above two. In Bridgeport Guardians, Inc. v. Members of the Bridgeport Civil Service Commission, 354 F.Supp. 778, 784-86 (D.Conn.) (Newman, J.), aff'd in relevant part, rev’d in part on other grounds, 482 F.2d 1333 (2d Cir.1973), cert. denied, 421 U.S. 991, 95 S.Ct. 1997, 44 L.Ed.2d 481 (1975), and Chance v. Board of Examiners, 330 F.Supp. 203, 211-12 (S.D.N.Y.1971), aff'd, 458 F.2d 1167 (2d Cir.1972), the courts aggregated the results of multiple tests before assessing the hypothesized disparities. Arguing that such an approach would be inappropriate here, defendants identify trivial differences between the tests — for example, that fifteen of the multiple-choice questions on the 1972 test had five choices instead of four — and point out that the test questions were not precisely identical. See Transcript at 1435-36, 1482. But the evidence at trial showed that the tests were substantially similar: both consisted of 75 multiple choice questions, including 30 on understanding legal passages, 30 on judgment in situations likely to arise in police work, and 15 on preparing written reports. The scoring formula for each exam was the same. In addition, the deposition testimony of the since-deceased Dr. Sol Kabel, the Associate and then Principal Personnel Examiner at the Department of Civil Service until 1981, which was admitted over defendants’ objections in an unrelated proceeding, demonstrated that both examinations came from the same series and thus were similar in design, concept, and content. See Kabel Deposition at 126-30, 141-44. 
Further, the state’s use of the tests for the identical position should provide adequate assurance of similarity, for its interest in fair and consistent selection procedures should have provided strong inducement to minimize variations in two tests given for an identical position within the space of one year. See Bridgeport Guardians, 354 F.Supp. at 784.
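
The arithmetic behind aggregation can be sketched as follows. This is an illustration only: the counts are hypothetical, the function names are invented for the example, and the pooled two-proportion statistic is an assumed stand-in, since the opinion does not state the formula Dr. Siskin used. Summing counts across administrations also presupposes that the tests are comparable, which is the very point the parties dispute.

```python
import math


def two_proportion_z(min_pass, min_n, ref_pass, ref_n):
    """Difference in pass rates, in standard-deviation units, using a
    pooled pass rate (an assumed, textbook-style statistic)."""
    p_min = min_pass / min_n
    p_ref = ref_pass / ref_n
    pooled = (min_pass + ref_pass) / (min_n + ref_n)
    se = math.sqrt(pooled * (1 - pooled) * (1 / min_n + 1 / ref_n))
    return (p_ref - p_min) / se


def aggregate_z(administrations):
    """Sum the counts across test administrations, then compute the
    statistic once on the combined sample."""
    totals = [sum(column) for column in zip(*administrations)]
    return two_proportion_z(*totals)


# Hypothetical counts per administration:
# (minority passers, minority takers, white passers, white takers)
YEAR_A = (5, 20, 45, 100)
YEAR_B = (5, 20, 45, 100)

if __name__ == "__main__":
    print(two_proportion_z(*YEAR_A))        # each year alone: under two
    print(aggregate_z([YEAR_A, YEAR_B]))    # combined: over two
```

With identical pass rates in each year, doubling the sample multiplies the statistic by the square root of two, which is why a disparity that is insignificant in each small sample can become significant in the aggregate.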

Moreover, the Casteneda threshold operates to establish an inference as to discrimination only above the line; there is no converse presumption ascribing to chance levels of probability below “two or three.” Equal Employment Opportunity Commission v. American National Bank, 652 F.2d 1176, 1192 (4th Cir.1981) (Phillips, J.), cert. denied, 459 U.S. 923, 103 S.Ct. 235, 74 L.Ed.2d 186 (1982). “[W]ell short of three standard deviations the probability levels for chance as explanation have already dropped far below the point at which courts of law — concerned with proof by the ‘greater weight’ or ‘preponderance’ of the evidence — would presumably have discarded the hypothesis of chance.” Id.; see Guardians Association of New York City Police Department, Inc. v. Civil Service Commission, 630 F.2d 79, 86 n. 4 (2d Cir.1980) (recognizing that likelihood that actual results will fall more than two standard deviations beyond the expected results is only about 4.6%), cert. denied, 452 U.S. 940, 101 S.Ct. 3083, 69 L.Ed.2d 954 (1981). In this case, aggregating the impact of the 1972 test on all minorities by adding the figures for blacks to those for Hispanics produces a disparity of 2.04 units of standard deviation. See Table 403. While not conclusive of discriminatory effect under Casteneda, this level of probability certainly has some probative value. See American National Bank, 652 F.2d at 1192. Finally, the ratio of black passers to white passers on the 1972 test is below the eighty-percent threshold of the Uniform Guidelines, and the ratio of Hispanic passers to white passers on the 1972 and 1973 tests is well below that threshold.
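
The relationship between units of standard deviation and the probability of a chance result can be checked with a short calculation. The sketch below assumes a normal approximation and a two-sided probability, which is consistent with the figure of about 4.6% cited above for two standard deviations; the function name is hypothetical.

```python
import math


def chance_probability(std_devs):
    """Two-sided probability, under an assumed normal approximation,
    that a result falls at least this many standard deviations from
    the expected value by chance alone."""
    return math.erfc(std_devs / math.sqrt(2))


if __name__ == "__main__":
    for z in (2.0, 2.04, 3.0):
        print(z, round(chance_probability(z), 4))
    # Two standard deviations gives roughly 0.0455;
    # three standard deviations gives roughly 0.0027.
```

On this approximation the probability of a chance result is already below five percent at two standard deviations and falls by more than a factor of ten by three.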

Under these circumstances, the United States has established a prima facie case in the categories it seeks to aggregate. Contrary to Yonkers’ contention, the Chance court did not find aggregation inappropriate. It began its survey of the statistical data by noting “the aggregate pass-fail statistics for the entire group,” which comprised fifty examinations extending over seven years. 330 F.Supp. at 209-10. The court did find the use of small samples troublesome, observing that the low number of minority applicants taking forty-one of the fifty examinations meant that “each such examination, when analyzed individually, cannot be accorded much weight or significance,” and finding it impossible to draw “meaningful conclusions as to differences ... from any one of these 41 examinations.” Id. at 212 (emphasis supplied). It therefore “prefer[red] the greater fact content found in the larger sample” available for the remaining nine tests. Id. But as its recitation of the aggregate figures makes clear, it accorded these numbers probative value.

The use of aggregation does not mean, as Yonkers seems to assume, that the evidence involved necessarily takes on more significance. Statistics in whatever form are but circumstantial evidence for a given factual hypothesis. See, e.g., Teamsters, 431 U.S. at 340, 97 S.Ct. at 1856; Chance, 458 F.2d at 1173. A need to aggregate to achieve statistical significance may lessen the probative force of the data, which will vary as well with the size of the aggregate sample and the degree of similarity in the tests. Defendants will have the opportunity on rebuttal to attack the evidentiary significance of the aggregate. Furthermore, the need to aggregate may be highly relevant to the remedies that are appropriate if a violation is ultimately established. The data for 1972 and 1973 constituted a far weaker signal to Yonkers of the potentially discriminatory impact of its test than was signalled by the 1977 test, so the propriety of imposing drastic corrective measures for the consequences of using the tests in the earlier years becomes far more questionable.

Yonkers barred women from taking the 1972 patrolman’s examination, allegedly in reliance on state instructions. Yonkers Defendants’ Proposed Factual Chronology ¶¶ 27-28, 33 (June 5, 1984); Gov’t Exhibits 113-14; Transcript 742. Dr. Siskin’s report indicates that the 1973 and 1977 physical agility tests administered to applicants who had successfully completed the written examination had a disparate impact upon women: in 1973 no women passed the test, while 88% of the male candidates did, yielding a disparity of 4.74 units of standard deviation, and in 1977 the percentage of female passers was only 68% that of male passers, yielding a disparity of 4.48 units of standard deviation. See Table 1201. Defendants argue that the small sample sizes make these figures unreliable. The Supreme Court has cautioned that small sample size “may, of course, detract from the value of [statistical] evidence,” Teamsters, 431 U.S. at 340 n. 20, 97 S.Ct. at 1856 n. 20, but standard deviation analysis takes account of sample size as well as degree of disparity in measuring statistical significance. See Transcript 213; Dendy v. Washington Hospital Center, 581 F.2d 990, 992 (D.C.Cir.1978) (per curiam) (“While the numbers involved may have appeared small [35 test-takers, of whom nine were black], statistical analysis showed them to reflect a discriminatory impact that could not reasonably be ascribed to chance alone.”). For the 1973 test, the disparity is so great that the relatively small sample reaches well into the statistically significant range, whereas for the 1977 test a larger sample yields similar results despite the smaller disparity.

Defendants also argue that individual circumstances account for several of the female failures in both tests and that removing these women from the samples would dilute the results into statistical insignificance. Yonkers suggests that in 1973 only two of the four women who took the test actually “failed” because the other two “voluntarily withdrew.” Yonkers does not contend, however, that either woman accrued the minimum points necessary to pass the test before withdrawing, though it suggests that one “could have”; nor does it contend that the personnel administering the test would not have regarded men who withdrew in this manner also as having failed. The women presumably withdrew because they determined that they could not pass. In any case, absent evidence that men were scored differently in this regard, no reason exists to subtract these two women from the failures. Defendants likewise argue that the 1977 sample is flawed because one unsuccessful candidate testified that she had not gotten proper rest before her first try and was ill for her second; because another testified that she would have passed had she not been required to run a disputed extra lap; and because a third testified that she would have passed had she attended the exam but never received notice that it would take place. Yonkers has argued strenuously, however, that it treated all candidates alike in notification and administration of the test. Absent detailed evidence of the state of readiness of each candidate, the exactitude with which each test was administered, or the proportion of candidates who might, through “administrative problems,” have failed to receive notice, the peculiar circumstances of these or any other candidates do not affect the reliability of the sample as a whole.

Finally, where the sample size is relatively small it is appropriate to look to nonstatistical evidence. Here it is uncontroverted that Yonkers did not permit women to take the 1972 patrolman’s examination. Initially, Yonkers Municipal Civil Service Commission personnel refused applications from women for the 1973 test as well, upon the advice of Joseph Silverman of the Municipal Service Division of the State Civil Service Division, see Transcript 1129-30, 1405, 1489; Gov’t Exhibit 1204, but changed its stance some weeks later when Joseph Watkins of the same office advised it to accept women’s applications so long as they could meet other requirements specified in the announcement, see Transcript 1489; Gov’t Exhibit 1204. Yonkers’ reluctance to accept applications from women contributed to the small number of those who took the written test, and thus to the small sample available for the physical test. Yonkers’ conduct further pruned that sample when it rejected six of the eight female candidates who had passed the written examination for failure to meet the 5'8" minimum height requirement. After two of the rejected candidates filed complaints with the New York State Division of Human Rights, Yonkers agreed to permit the women to take the physical agility test. While the parties dispute whether Yonkers ever notified the underheight women who had not complained, compare Gov’t Exhibit 165 with Transcript 1408-09, 1411, only the two complainants appeared. Yonkers should not be permitted to question statistically significant findings on the basis of sample size where the small size of the sample was caused, at least in part, by Yonkers’ own conduct. Cf. Boston Chapter, NAACP, Inc. v. Beecher, 504 F.2d 1017, 1021 (1st Cir.1974) (small sample size suggesting discriminatory effect should not be disregarded where limited data “may be traceable to the test’s discouraging effect as well as to unequal recruitment practices”), cert. denied, 421 U.S. 910, 95 S.Ct. 1561, 43 L.Ed.2d 775 (1975).

The 5'8" height requirement also had a disparate impact on women. The Supreme Court has approved the use of national statistics to demonstrate the effect of height requirements. See Dothard v. Rawlinson, 433 U.S. 321, 329-30, 97 S.Ct. 2720, 2726-27, 53 L.Ed.2d 786 (1977); see also Vanguard Justice Society, Inc. v. Hughes, 471 F.Supp. 670, 710 (D.Md.1979). Figures from the National Center for Health Statistics for 1971 through 1974 indicate that at the time over 96% of American women were under that height while less than 28% of American men were. See Table 1801. While Yonkers abandoned the requirement before the actual administration of the written exam, its inclusion in the announcement and Yonkers’ initial insistence upon it may have discouraged at least some women from applying.

The record made thus far also includes evidence concerning a decision of Yonkers officials which modestly exacerbated the impact of the City’s 1973 test. In December 1974 Yonkers established an eligibility list from the 1973 examination, which was originally slated to remain in effect for two years. Around that time the Municipal Services Department indicated that the state would administer police officer examinations in February and October 1975. Christiansen recommended on February 3, 1975, however, that the Commission not participate because Yonkers had “adequate lists going through at least August of 1977,” referring to the eligibility list from the 1972 exam, which would not expire until that time, as well as that from the 1973 test. Christiansen conceded that at the time he made this recommendation he did not take into account female and minority representation on the list. The Commission voted to extend the life of the 1973 list an additional year to December 1977. Commissioner Carey testified that the Commission had no information on the makeup by race and gender of the list when they decided to extend it. Christiansen credibly ascribed his reluctance to hold another exam to the severe fiscal crisis Yonkers was then suffering and to his belief that, given the extensive lists already in existence, a new examination would hold out but false hope of eventual employment. Although only twelve appointments, all white males, were made from the 1973 list, this impact nevertheless resulted from a deliberate, albeit good faith, decision. In 1977 Christiansen recommended that the Commission not extend the 1973 list an additional year explicitly in order to facilitate “a concentrated effort to attract a greater representation of minorities” on a new list.

Finally, Yonkers contends that it has already demonstrated that Dr. Siskin’s data are too unreliable to provide the basis for a finding of discriminatory impact in this case. It first criticizes Dr. Siskin’s use of a “one-sided” or “one-tailed” test. See, e.g., Brown v. Delta Air Lines, Inc., 522 F.Supp. 1218, 1228 n. 14 (S.D.Tex.1980); Harper, Statistics As Evidence of Age Discrimination, 32 Hast.L.J. 1347, 1355 n. 65 (1981). It points out that he has repeatedly amended the raw data on which he bases his analyses and that he admitted on cross-examination to numerous computational errors. See Transcript at 252-56, 264. And, as noted, it questions the techniques, such as aggregation, used to achieve his results, as well as their overall statistical significance. While these criticisms and others yet to be made may prove telling, they do not at this point so undermine Dr. Siskin’s data as to have effectively rebutted the prima facie case. By separate, unpublished order this court has awarded defendants the excess costs involved in evaluating and responding to Dr. Siskin’s revisions, but in the interest of justice it has admitted the final version of his report as the most comprehensive and accurate assessment of the plaintiff’s case. Yonkers and the State will have a full opportunity to rebut that case, and Dr. Siskin’s shifts in analysis will then be fair game.

The data of Dr. Siskin’s report and the other evidence relied on here do

not prove a ... disproportionate impact with complete mathematical certainty. But there is no requirement that they should.... We must not forget the limited office of the finding that [female], black and Hispanic candidates did significantly worse in the examination[s] than others. That does not at all decide the case; it simply places on the defendants a burden of justification which they should not be unwilling to assume.

Vulcan Society of the New York City Fire Department, Inc. v. Civil Service Commission of the City of New York, 490 F.2d 387, 393 (2d Cir.1973) (Friendly, J.).

The defendants’ motions to dismiss are denied.

SO ORDERED.  