
    Hermino CUESTA, Joann Barbagallo, Carlos Chaluisan, Jeannette Dvailla, Christopher Garcia, Earl Hall, James Lee, Samuel Leon, Katherin Oliver, Victor Ortiz, Paul E. Parker, Josep Riffas, Henry Seltzer, Jeff Skopp, Sharon Smalls and Stephen Walters, on their own behalf and on behalf of all others similarly situated, Plaintiffs, v. The STATE OF NEW YORK OFFICE OF COURT ADMINISTRATION; Honorable Herbert B. Evans, in his capacity as State Administrative Judge of the New York State Office of Court Administration; The New York State Civil Service Commission; and Joseph A.F. Valenti, in his capacity as President of the New York State Civil Service Commission and Civil Service Commission, Defendants.
    No. 83 Civ. 3714 (PNL).
    United States District Court, S.D. New York.
    March 16, 1989.
    Gordon, Schechtman & Gordon, New York City (Murray A. Gordon, Kenneth E. Gordon, Richard Imbrogno, of counsel), for plaintiffs.
    
      Michael Colodner, Office of Court Admin., New York City (John Eisman, Ann Pfau, of counsel), for Court defendants.
    Robert Abrams, Atty. Gen. of the State of N.Y., Barbara B. Butler, Asst. Atty. Gen., New York City, for Civil Service defendants.
    Dretzin & Kauff, New York City (Raymond G. McGuire, of counsel), for defendants-intervenors.
   OPINION AND ORDER

LEVAL, District Judge.

This is an employment discrimination suit, brought under Title VII, 42 U.S.C. § 2000e et seq., challenging the validity of written examinations administered by the New York State Office of Court Administration for the filling of court officer positions.

After a trial by submissions, I held that, although plaintiffs had established a prima facie case, the defendant had successfully defended the validity of all aspects of the tests except for the use of rank-order scoring to determine job priority. I required the parties to submit additional evidence and briefing on the use of rank-order scoring and alternative methods which might be employed to establish hiring priority. I now find that the use of rank-ordered scoring in determining the order of eligibility for hiring did not render the examination illegal.

DISCUSSION

In my earlier opinion, 657 F.Supp. 1084 (hereafter “Opinion”), I held that the 1982 exams had been validated by the defendants for use in selecting candidates for permanent UCO and SCO positions. I also upheld the use of a cutting score to set a pass-fail mark on the exams. The sole issue left for decision was how the test results should be used to arrange a preference order of the candidates who passed the exam. This determination is relevant only as to those displaced provisional candidates who passed the exam, but were not rehired because of their low scores.

The Guidelines provide that rank-ordering may be used if the employer can prove that “a higher score ... is likely to result in better job performance.” Guidelines § 14(c)(9). In Guardians Association of the New York City Police Department v. Civil Service Commission of the City of New York, 630 F.2d 79, 104 (1980), cert. denied, 452 U.S. 940, 101 S.Ct. 3083, 69 L.Ed.2d 954 (1981) (“Guardians IV”), the Court of Appeals strongly counselled against rank-ordering when this was likely to cause arbitrary results. Rank-ordering was found particularly unsatisfactory in that case because two-thirds of the passing scores were bunched within four points. Nonetheless, the Court made clear that rank-ordering is permissible if the employer demonstrates “that rank-ordering is sufficiently justified to be used.” Id., 630 F.2d at 103. The Court noted that a criterion-related study would be helpful. The Court went on to suggest that a job analysis and a high degree of adherence to Guideline requirements could result in a strong showing of content validation. “[T]here must be a substantial demonstration of job relatedness and representativeness.... The test maker can achieve an adequate degree of reliability by careful design of the exam so that the questions will yield a satisfactory degree of consistent results.” Methods were suggested to guard against inconsistency based on extraneous factors, stressing that exam questions “should have significant differentiating power, so that scores are not bunched....” Id., 630 F.2d at 104.

I found in my earlier opinion that OCA had conducted an exhaustive job analysis, pre-tested the examinations thoroughly, and made every deliberate effort to adhere to the principles of the Guidelines. Opinion, 657 F.Supp. at 1087-1093, 1097-1099. I also found that the tests were sufficiently job-related and tested for representative SKAPs. I was nonetheless skeptical of the validity of rank-order scoring. My concerns have been sufficiently answered.

I had expressed concern that only 10.1% of the identified SKAPs were tested on the written exam. Defendants have demonstrated, however, that because 61.7% of the SKAPs which were necessary for performing the UCO job were learned either on the job or in training, only 38.3% of the total SKAPs remained to be measured, of which number 5.6% were deemed too trivial for examination. Of the remaining 32.7%, 22.6% were appropriately tested by pass-fail standards because they pertained to meeting minimum qualifications. This left 10.1% to be tested on the graded, written examination. Defendants have made a strong showing that the tested SKAPs were performance-differentiating, that is, helped to measure a candidate’s ability to acquire the SKAPs which would be learned on the job and in training. Wagner Aff. p. 11. The written examinations were constructed with the most job-related applicability possible given the job analysis and pre-testing that had preceded the 1982 exams. The defendants have adequately shown that their tests satisfied the Guardians requirements of job-relatedness and reliability for the use of rank-ordered scoring.
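The successive subtraction described above can be checked with a short sketch. It is offered only as an illustration of the arithmetic: the percentages are those stated in the text, while the variable and function names are hypothetical and do not come from the record.

```python
from decimal import Decimal

# Percentages of total UCO SKAPs, as stated in the text above.
TOTAL = Decimal("100.0")
LEARNED_ON_JOB_OR_IN_TRAINING = Decimal("61.7")
TOO_TRIVIAL = Decimal("5.6")
PASS_FAIL = Decimal("22.6")


def remaining_after(total, *removed):
    """Return what is left of `total` once each removed share is subtracted."""
    return total - sum(removed, Decimal("0"))


# Hypothetical names for the three intermediate figures in the paragraph.
remaining_to_measure = remaining_after(TOTAL, LEARNED_ON_JOB_OR_IN_TRAINING)
remaining_after_trivial = remaining_after(remaining_to_measure, TOO_TRIVIAL)
written_exam_share = remaining_after(remaining_after_trivial, PASS_FAIL)

print(remaining_to_measure)     # 38.3
print(remaining_after_trivial)  # 32.7
print(written_exam_share)       # 10.1
```

Decimal arithmetic is used here so that the one-decimal percentages compare exactly, which binary floating point would not guarantee.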

Furthermore, the manner in which the hiring was conducted softened the effect of the rank-ordering, tending in the direction of zoning. After the administration of the 1982 written exams, all of the passing candidates were placed on a list ordered by how they scored on the exam. Two lists were created, a preference-eligible list and an open-competitive eligible list. As jobs became available, the candidates would be contacted and asked to continue the application process. A candidate who was contacted would then schedule and take four separate pass-fail examinations for the remaining identified SKAPs (a medical examination, a test of physical ability, a psychiatric examination, and a background check). There was often a considerable time-period between the initial contact and the appointment to the job. Because of differing times taken by applicants to complete the process, the order of hiring often departed from the rank-order of the scores.

Over the six-year life of the eligible lists, 6,490 candidates were contacted from the open-eligible list in order to make 1,389 permanent appointments. Ferrara Aff. p. 17. Thus only 21.4% of all candidates who passed the exam and scored high enough to be contacted were actually appointed. It appears that during this process, in order to obtain a large enough pool to fill open jobs, OCA contacted broad bands of candidates within a significant scoring range on the same date. When those candidates would be finally appointed depended not on their score on the written examination, but on the speed with which they dispatched the remaining portions of the application process. Ferrara Aff. Exhibit H. Thus the process followed had the practical effect of a modified zone method.
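The 21.4% figure follows directly from the two counts given in the Ferrara affidavit. A minimal sketch of that calculation follows; the function name is hypothetical and is used only for illustration.

```python
def appointment_rate(appointed: int, contacted: int) -> float:
    """Percentage of contacted candidates who were ultimately appointed."""
    if contacted <= 0:
        raise ValueError("contacted must be a positive count")
    return 100.0 * appointed / contacted


# Counts stated in the text: 1,389 appointments from 6,490 contacted candidates.
rate = appointment_rate(appointed=1389, contacted=6490)
print(f"{rate:.1f}%")  # 21.4%
```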

Plaintiffs contend that it was the rank-ordering that resulted in the appointment of so few minority candidates. They present charts showing that of the total appointees off the open-eligible list, only 4.8% were black and only 9.4% were hispanic. These numbers are far less than the percentage of the number of blacks and hispanics who actually passed the written examination, which were 30% and 8.1% respectively. Gordon Aff. Table VIII. Thus plaintiffs contend that the majority of minority candidates who passed must have scored below the raw score of 62, under which no appointments were made. The argument is not as logical as plaintiffs make it sound. Because only 21.4% of those passing candidates who were contacted received appointments, it is likely that many minority candidates who were contacted either declined or flunked the remaining pass-fail qualifications.

An important aspect of judging the use of rank-ordering is to discover to what extent varying scores meaningfully measure different levels of knowledge or capability. In Guardians IV, the Court of Appeals found that the vast majority of candidates’ scores on the police exam were bunched between 94 and 97. On such a set of scores, the Court found that a system of point-by-point rank ordering would not provide a job-related basis for selecting candidates without significant arbitrariness. Guardians IV, supra, 630 F.2d at 104. The Guardians IV court found that the bunching of scores was “the most serious implication of error” in the measurement of the police exam. Guardians IV, supra, 630 F.2d at 103.

In contrast, scores on the UCO examination were widely distributed over a significant range. See Pl. Trial Doc., Vol. I, Exhibit 13, pp. 4-5. For this reason, greater reliability of measurement may be suggested. Considering the careful preparation and research that went into this test, such an inference would be fair. Given the strong showing by the defendants of the pre-testing analysis conducted precisely to create exam questions with differentiating power, I am inclined to find that the defendants are able to justify the use of rank-ordered scoring by means of the two methods suggested by the Guardians IV court.

Defendants also show that rank-ordering was validated by a criterion-related validation study, providing empirical evidence of the validity of the 1982 tests. Many of the people who received CO appointments on the basis of the 1982 test also took the 1986 SCO test. Defendants show that there is a positive correlation between success on the 1982 CO examination and success on the 1986 SCO examination. Def.Mem. of Law, pp. 30-32. Plaintiffs contend that one cannot use one test to validate another. They argue that success on the 1982 exam must correlate with actual job performance. Defendants point out, however, that the SKAPs on the 1986 SCO exam are those which must be learned by court officers in their first year on the job. While it may be true that subsequent testing is not the best means of criterion-related validation, the method is specifically contemplated by the Guidelines:

[Criterion] [m]easures of relative success in training include but are not limited to ... tests. Criterion measures consisting of paper and pencil tests will be closely reviewed for job relevance. § 14(b)(3)

Although I do not rely on corroboration provided by the 1986 test, I must acknowledge that it does not favor the plaintiffs. It tends rather to support the defendants’ showing.

I said in my prior opinion that test validity is a matter of degree. Opinion, at 1100. Defendants have made a considerable showing of reliability and job-relatedness. Furthermore, the non-preference provisionals who are the plaintiffs in this case have not shown as severe an adverse impact as found in other cases to justify relief: The disparities here among the non-preference provisionals did not approach those found actionable in Guardians. Although the use of rank-order scoring may have caused some of the disparate impact, I cannot say that such impact was so significant as to invalidate the procedures. Defendants have carried their burden of demonstrating the validity of the use of rank-order scoring.

Plaintiffs have not suggested a method of ordering which is clearly preferable on these facts. The zoning suggested by plaintiffs is so broad-banded as to destroy the value of the test in distinguishing between passing candidates. Alternatively, plaintiffs propose individualized hearings, which is exactly what standardized testing seeks to avoid to reduce the likelihood of intentional or subconscious discrimination.

It must be recognized also that while zone scoring can decrease arbitrary distinctions between certain candidates, it may increase the arbitrariness of distinction between others who fall on opposite sides of the zone’s dividing point. In some instances, as in Guardians IV, it can serve an important goal. But it does not follow that zone scoring is always required, or even always helpful in avoiding arbitrariness.

CONCLUSION

The testing of job-qualifications is not susceptible to perfection. In designing this examination the defendants took extraordinary care to produce a test that would fairly measure qualifications free of both intentional and involuntary discrimination. Whether they achieved perfection is not the standard. Nor is the issue whether we can conceive of a preferable system. I find that the defendants have adequately defended the test and the method used to establish hiring priority under it.

Judgment is for the defendants. The clerk shall enter judgment.

SO ORDERED. 
      
      . Thus the defendants’ arguments imply that the written UCO examinations tested for 55% of the total SKAPs necessary for carrying out the responsibilities of the UCO. The breakdown is as follows:
      Written exam 10.1%
      Learned (Knowledges and Skills) 44.9%
      Learned on job (Motivational) 16.8%
      Trivial 5.6%
      Pass-fail 22.6%
      On the SCO examination, a comparable SKAP breakdown would mean that the written examination tested for 54.5% of total SKAPs:
      Written Exam 27.7%
      Learned (Knowledges and Skills) 26.8%
      Learned (Motivational) 23.7%
      Trivial 6.1%
      Pass-fail 15.7%
     
      
      . This discussion concerns the UCO exam primarily. Mention of the SCO exam will be made where applicable.
     
      
      . The preference-eligible list consisted of provisional CO’s who passed the examination and had been appointed prior to June 21, 1980. Ferrara Aff. pp. 15-16.
     
      
      . The defendants also created a careful system of inverse displacement. The provisionals who failed the exam or who did not appear for the exam were displaced first, followed by those provisionals who scored lowest on the test. Since the selection procedure often involved contacting candidates at different score points, provisionals were given a special preference in contact. These provisionals also could be rehired more quickly because they had already passed the pass-fail tests when they were appointed as provisionals. Although this process helped avert harm to many provisionals, it also had the effect of causing those who scored lowest on the exam to wait the longest before being rehired, if they were rehired at all.
     
      
      . Plaintiffs indicate that the percentage of white candidates who scored higher than 62 was significantly higher than the minority rate. According to plaintiffs’ charts, 57.8% of whites scored better than 62, while only 38.3% of blacks and 38.8% of hispanics scored better than 62. The total percentage of candidates who scored higher than 62 was 49.7%. Gordon Aff. Table IV.
     
      
      . Among non-preference provisionals, the group passing rate for the two tests was 81.8%. Subgroup percentages follow:
      Whites 88.6%
      Non-Whites 71.5%
      Although the numerical impact among the plaintiff class is slight, I found that plaintiffs had stated a prima facie case because of the disparate results among the total candidate pool. Opinion, p. 18.
     