
    Michael E. WILLIAMS; Flora Lynn McMiller; Keith Arnett Harris; Almesha Spinkston; Gerrol Gerard Townsend (97-2049); Alan Dale; Bryan Thomas (98-1256), Plaintiffs-Appellants, v. FORD MOTOR COMPANY, Defendant-Appellee.
    Nos. 97-2049, 98-1256.
    United States Court of Appeals, Sixth Circuit.
    Argued: Dec. 18, 1998.
    Decided and Filed: Aug. 9, 1999.
    
      William H. Bartle (argued and briefed), Murray & Murray, Sandusky, Ohio, for Plaintiffs-Appellants.
    John H. Beisner (argued), William J. Stuckwisch (briefed), Brian D. Boyle (briefed), O’Melveny & Myers, Washington, D.C., George L. Forbes, Forbes, Forbes & Associates, Cleveland, Ohio, for Defendant-Appellee.
    Before: KENNEDY and BATCHELDER, Circuit Judges; WISEMAN, District Judge.
    
    
      
       The Honorable Thomas A. Wiseman, Jr., United States District Judge for the Middle District of Tennessee, sitting by designation.
    
   OPINION

BATCHELDER, Circuit Judge.

I.

A. Procedural Background

This case is one of many employment discrimination cases in multidistrict litigation against Ford Motor Co. The complaint in the case before us here (hereinafter “Williams”), is a state-law employment discrimination class action claiming violations of Ohio Revised Code § 4112.02, which was originally filed by Michael Williams, Flora McMiller, Keith Harris, Almesha Spinkston, and Gerrol Townsend in Common Pleas Court in Lorain County, Ohio, and removed on diversity grounds to the U.S. District Court for the Northern District of Ohio. After Ford moved to consolidate the case with three others, Jones v. Ford Motor Co., Rose v. Ford Motor Co., and Glinton v. Ford Motor Credit Co., the multidistrict panel transferred all four of the cases to the Eastern District of Michigan. The multidistrict panel did not, however, consolidate the cases.

Shortly after the Williams plaintiffs filed a motion for class certification, Alan Dale and Bryan Thomas filed a tag-along action (“Dale”), claiming that “Ford Motor Company discriminates on the basis of race against Afro-American applicants for unskilled hourly employment through the use of its [pre-employment] test,” in violation of Ohio Revised Code § 4112.02. The district court entered an order in Williams, certifying a class consisting of “all African American applicants for unskilled hourly employment with Ford from 1989 to the present who scored low on the unskilled pre-employment test and who were thereby excluded from unskilled employment with Ford within the state of Ohio.” In its order certifying the class, the court noted that plaintiff Harris did not qualify as a class member because he had scored a “medium” on the test, and that the plaintiffs had conceded that three others of the named plaintiffs did not qualify as class members, two having never taken the test at all and one having scored a “medium.” The court held, however, that the plaintiffs in Dale are adequate class representatives. In that same order, the court denied plaintiffs’ motion to consolidate the case with Jones v. Ford Motor Co., Rose v. Ford Motor Co., and Glinton v. Ford Motor Credit Co. because “the nature of Williams is clearly different from [the other three cases].”

Ford filed a motion for summary judgment “on each and every claim asserted in plaintiffs’ Complaint,” which the district court granted in its entirety, dismissing the action. The court also granted Ford’s unopposed motion for judgment in the Dale tag-along action since the Dale plaintiffs were members of the Williams class. Dale and Williams were consolidated for this appeal.

B. Factual Background

Ford operates seven plants in Ohio where motor vehicles and/or their component parts are manufactured and/or assembled. Each plant employs unskilled hourly workers. Ford began using a pre-employment test referred to as the Hourly Selection System Test Battery (“HSSTB”) around 1989 to screen applicants for the unskilled hourly positions. Candidates receiving a “low” score are not allowed to proceed in the application process though these applicants may retest, except at Ford’s Sandusky plant. The HSSTB measures five areas: reading comprehension, arithmetic, parts assembly, visual speed and accuracy, and precision/manual dexterity. The manual dexterity test may be measured by an apparatus-based test or a paper and pencil test requiring precise finger/hand movement. A score above the 50th percentile on the HSSTB is classified as “high;” a score between the 25th and 50th percentile is classified as “medium.” The “low” scoring candidates, who may not proceed in the application process, are those who score below the 25th percentile.
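The scoring bands described above can be illustrated with a minimal sketch. The function names are hypothetical, and the treatment of a score falling exactly on the 25th or 50th percentile is an assumption, since the opinion does not address those boundary cases.

```python
# Illustrative sketch only; not Ford's or HR's actual scoring procedure.
# Assumption: a score exactly at the 25th or 50th percentile is treated
# as "medium". The opinion does not say how boundary scores are handled.

def classify_hsstb(percentile: float) -> str:
    """Map a percentile rank (0-100) to the band described in the opinion."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if percentile > 50:
        return "high"      # above the 50th percentile
    if percentile >= 25:
        return "medium"    # between the 25th and 50th percentile
    return "low"           # below the 25th percentile


def may_proceed(percentile: float) -> bool:
    """Candidates scoring "low" do not proceed in the application process."""
    return classify_hsstb(percentile) != "low"
```

Under these assumptions, `classify_hsstb(30)` yields "medium" and `may_proceed(10)` is false, matching the rule that only "low" scorers are screened out at this stage.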

Prior to implementation of the HSSTB, Ford utilized referrals from the Ohio Bureau of Employment Services (“OBES”); OBES administered a General Aptitude Test Battery (“GATB”) to prospective candidates. When OBES began to restrict the use of GATB in late 1988, Ford decided to develop its own test. In early 1989, Ford hired Personnel Designs, Inc., which became HRStrategies, Inc. and now operates as a division of Aon Consulting (hereinafter referred to as “HR”), to develop an interim and long-term selection test for hourly production employees. Because Ford’s hourly production employees rotate among assignments and applicants do not apply for positions within specific job classifications, Ford instructed HR to develop a single test battery for all hourly production employees at its facilities. After visiting over a dozen representative Ford facilities to collect information on the content of the Ford hourly jobs, HR developed an interim test battery that was administered in June 1989 at the Lorain and Ohio Assembly Plants. The final test battery, which differed in a number of respects from the interim test, was implemented in late 1989 as a component of Ford’s hourly hiring process.

In developing the final HSSTB, HR conducted an extensive job analysis to identify the knowledge, skill and ability requirements of Ford hourly production jobs. From December 1989 through May 1990, Ford supervisors participated in job analysis inventories; the supervisors then rated each job requirement and job ability identified in the inventories on its importance to the job category. HR assessed the reliability of the ratings and analyzed the data to identify key job requirements and abilities, and on the basis of the analysis, developed specific tests to measure the skills necessary to perform the job requirements that had been rated as “important” by the experts across all the job categories.

In 1993, Ford directed HR to conduct a criterion-related study, that is, a statistical analysis of the relationship between performance on the test and performance on the job. A sample of 105 hourly employees were evaluated on their job performance by 46 selected supervisors; HR analyzed the relationship between the employees’ HSSTB scores (pre-employment) and their job performance ratings. HR reported “criterion-related validity analyses with data collected to date show a strong pattern of significant correlations between the tests and performance ratings made by supervisors.” Both the initial study and a supplemental study demonstrated a correlation of .30 between the test battery score and overall job performance rating; the supplemental study concluded that “[t]he obtained validity (correlation) coefficient of .30 is both statistically and practically significant and is evidence of the validity and appropriateness of the Ford Hourly Selection Battery for selecting candidates for further consideration for hourly production jobs at Ford Motor Company.”

HR also conducted a validity generalization study called “meta-analysis” in which HR examined literature on sixty-one studies to determine whether similar tests used by other employers predicted employee performance in similar jobs. The meta-analytical study was conducted from January 1995 to April 1995.

Ford’s expert, Dr. Wayne Cascio, reviewed and analyzed the test development process and HR’s studies and found the test “was developed in a manner consistent with sound professional practice” and “the totality of the evidence for the validity of the HSS supports its continued use.” By contrast, Dr. Charles Cranny, expert for the plaintiffs, found numerous problems with the test development and validation studies conducted by HR, concluding that the results of the job analysis fail to demonstrate a clear linkage of specific requirements to job duties and the criterion-related evidence from the preliminary study is inadequate to demonstrate the job-relatedness of the HSSTB.

II.

A. Standard of Review

We review de novo the district court’s grant of summary judgment. The proper inquiry on appeal from a grant of summary judgment is “whether the evidence presents a sufficient disagreement to require submission to a jury or whether it is so one-sided that one party must prevail as a matter of law.” Anderson v. Liberty Lobby, Inc., 477 U.S. 242, 251-52, 106 S.Ct. 2505, 91 L.Ed.2d 202 (1986). Summary judgment is proper only if there is no genuine issue as to any material fact, and the moving party is entitled to judgment as a matter of law. Fed.R.Civ.P. 56(c). In reviewing a grant of summary judgment, the moving party’s evidence “must be viewed in the light most favorable to the [nonmoving] party.” Matsushita Elec. Indus. Co. v. Zenith Radio Corp., 475 U.S. 574, 587, 106 S.Ct. 1348, 89 L.Ed.2d 538 (1986); Adickes v. S.H. Kress & Co., 398 U.S. 144, 157, 90 S.Ct. 1598, 26 L.Ed.2d 142 (1970). However, if the non-moving party failed to make a sufficient showing on an essential element of the case with respect to which the non-movant has the burden, the moving party is entitled to summary judgment as a matter of law. Celotex Corp. v. Catrett, 477 U.S. 317, 322-23, 106 S.Ct. 2548, 91 L.Ed.2d 265 (1986).

B. Analysis

The Williams action was initiated generally pursuant to Ohio Rev.Code Ann. § 4112.02(A) which provides:

It shall be an unlawful discriminatory practice:
(A) For any employer, because of the race, color, religion, sex, national origin, handicap, age, or ancestry of any person, to discharge without just cause, to refuse to hire, or otherwise to discriminate against that person with respect to hire, tenure, terms, conditions, or privileges of employment, or any matter directly or indirectly related to employment.

The Williams plaintiffs alleged no specific practice which would violate the statute — indeed, they alleged no specific practice of any kind — claiming only that each of them has unsuccessfully sought employment with Ford at its Lorain County facility, and that

Ford Motor Company, due to race/color, has refused to hire them. Plaintiffs further state that other less qualified applicants, and/or others who applied after them but were white, were hired. Plaintiffs further state that they have been given no explanation for this defendant’s failure to hire them and others similarly situated in the same manner. Plaintiffs also state that this defendant has engaged in a pattern, practice and course of conduct of not hiring individuals who are black or Afro-American, and in giving preferential treatment to white applicants.

The Dale complaint, however, claims that the employment test utilized by Ford discriminates against African American applicants for unskilled hourly positions.

The Ohio courts have held that the evidentiary standards and burdens of proof applicable to a claimed violation of Title VII of the Civil Rights Act of 1964 are likewise applicable in determining whether a violation of Ohio Rev.Code § 4112 has occurred. Thus, the federal case law governing Title VII actions is generally applicable to cases involving alleged violations of Chapter 4112. See Little Forest Med. Ctr. v. Ohio Civil Rights Comm’n, 61 Ohio St.3d 607, 575 N.E.2d 1164, 1167 (Ohio 1991)( “ ‘reliable, probative, and substantial evidence’ in an employment discrimination case brought pursuant to R.C. Chapter 4112 means evidence sufficient to support a finding of discrimination under Title VII.”)(quoting Plumbers & Steamfitters Joint Apprenticeship Comm. v. Ohio Civil Rights Comm’n, 66 Ohio St.2d 192, 421 N.E.2d 128, 131 (Ohio 1981)).

In Albemarle Paper Co. v. Moody, 422 U.S. 405, 425, 95 S.Ct. 2362, 45 L.Ed.2d 280 (1975), the Supreme Court outlined the applicable burdens of the parties in test-related disparate impact cases:

Title VII forbids the use of employment tests that are discriminatory in effect unless the employer meets “the burden of showing that any given requirement [has] ... a manifest relationship to the employment in question.” This burden arises, of course, only after the complaining party or class has made out a prima facie case of discrimination, i.e., has shown that the tests in question select applicants for hire or promotion in a racial pattern significantly different from that of the pool of applicants. If an employer does then meet the burden of proving that its tests are “job related,” it remains open to the complaining party to show that other tests or selection devices, without a similarly undesirable racial effect, would also serve the employer’s legitimate interest in “efficient and trustworthy workmanship.”

Id. at 425, 95 S.Ct. 2362 (internal citations omitted). The Supreme Court made it clear in Wards Cove that once a plaintiff has made out a prima facie case, it is not the burden of persuasion which shifts to the employer; instead, at that point, the employer bears the burden of production of evidence demonstrating a business justification for the challenged employment practice. “[T]he ultimate burden of proving that discrimination against a protected group has been caused by a specific employment practice remains with the plaintiff at all times.” Wards Cove Packing Co. v. Atonio, 490 U.S. 642, 659-60, 109 S.Ct. 2115, 104 L.Ed.2d 733 (1989)(emphasis in original).

Therefore, to prove a selection procedure is job-related, the employer must show “by professionally acceptable methods, [that the test is] predictive of or significantly correlated with important elements of work behavior which comprise or are relevant to the job or jobs for which candidates are being evaluated.” Black Law Enforcement Officers Ass’n v. City of Akron, 824 F.2d 475, 480 (6th Cir.1987)(brackets in original)(internal quotation marks omitted)(quoting Albemarle Paper Co., 422 U.S. at 431, 95 S.Ct. 2362). In order “to assist employers ... to comply with requirements of Federal law prohibiting employment practices which discriminate on grounds of race, color, religion, sex, and national origin .... [and] to provide a framework for determining the proper use of tests and other selection procedures,” 29 C.F.R. § 1607.1(B)(1998), the Equal Employment Opportunity Commission (“EEOC”) developed the Uniform Guidelines, 29 C.F.R. § 1607.1-.16, a set of standards for establishing the validity of these tests and procedures. A test which is shown to have an adverse impact on hiring of members of any race or ethnic group is deemed “discriminatory” unless it is “validated” in accordance with the guidelines. 29 C.F.R. § 1607.3(A); Gonzales v. Galvin, 151 F.3d 526, 529 n. 4 (6th Cir.1998). See also Police Officers for Equal Rights v. City of Columbus, 644 F.Supp. 393, 414 (S.D.Ohio 1985)(noting a test may be validated, or shown to be job-related, if the employer demonstrates compliance with the validation procedures set forth in the Uniform Guidelines and the accepted standards of the psychological profession).

Employers may use three types of studies to validate an employee selection procedure under the Guidelines: content, construct, or criterion-related validity studies. 29 C.F.R. § 1607.5(A). See also Zamlen v. City of Cleveland, 906 F.2d 209, 218 (6th Cir.1990). The Guidelines also set forth specific documentary requirements for each type of study. § 1607.15; see also Gonzales v. Galvin, 151 F.3d at 532.

A test will have content validity if there is a direct relationship between the test contents and the job contents. Police Officers for Equal Rights v. City of Columbus, 644 F.Supp. at 414. Content validity may be demonstrated through data showing that the skills tested are representative of skills important to the job for which the applicants are to be evaluated.

A content validity study is most appropriate when test items directly measure abilities that are prerequisites to entry-level job performance (for example, a shorthand or typing test for a secretarial position). Content validity studies must include a thorough job analysis identifying the most important knowledge, skills, and abilities necessary to successful job performance.

Zamlen, 906 F.2d at 218 (citing Guardians Ass'n of New York City Police Dep't, Inc. v. Civil Service Comm'n of New York, 633 F.2d 232, 242 (2d Cir.1980)). It is not appropriate, however, for an employer to test for those skills that can readily be learned on the job. 29 C.F.R. §§ 1607.5(B), (F). Zamlen, 906 F.2d at 218.

A construct validity study evaluates procedures that test for abstract qualities which may be important characteristics for proper job performance but are challenging to test. Such studies are appropriate where the necessary qualities, such as creativity, cannot be measured directly; construct validity studies are usually not appropriate for validating physical exams because these physical qualities and abilities may be readily observed and quantified. Zamlen, 906 F.2d at 218.

A criterion-related study evaluates whether performance on the test adequately correlates with performance of the job. Gillespie v. Wisconsin, 771 F.2d 1035, 1040 n. 3 (7th Cir.1985). The guidelines provide that a criterion-related study “should consist of empirical data demonstrating that the selection procedure is predictive of or significantly correlated with important elements of job performance.” 29 C.F.R. § 1607.5(B). An employment test has criterion-related validity when the data demonstrates a significant positive correlation between degree of success on the test and the degree of success in some measure of job performance. This relationship is expressed as a “correlation coefficient.” A correlation coefficient of -1.0 indicates a completely negative relationship: the better one does on the test, the worse one performs on the job. A correlation coefficient of +1.0 indicates a complete identity between relative test scores and relative job performance. Clady v. County of Los Angeles, 770 F.2d 1421, 1426 n. 5 (9th Cir.1985), citing B. Schlei & P. Grossman, Employment Discrimination Law, 114 (2d ed.1983); see also Hamer v. Atlanta, 872 F.2d 1521, 1524-26 (11th Cir.1989)(discussing validation of employment exams, stating “Practical significance is the degree to which test scores relate to job performance and is measured by a ‘correlation coefficient.’ . . . ‘the degree of correlation between two variables (e.g., entrance exam scores and subsequent school grades) is expressed as a “correlation coefficient” on a scale running from +1.0 to -1.0’ ”)(quoting Ensley Branch of NAACP v. Seibels, 616 F.2d 812, 817 n. 13 (5th Cir.1980)).
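As an illustration of the scale just described, the following minimal sketch computes a Pearson correlation coefficient between test scores and job performance ratings. It assumes the Pearson product-moment formula, which the cited authorities do not name; the function name and the sample figures are hypothetical and do not come from the record.

```python
from math import sqrt

# Illustrative sketch only. Assumption: the "correlation coefficient" is the
# Pearson product-moment coefficient; the opinion does not specify a formula.

def pearson_r(test_scores, job_ratings):
    """Return the Pearson correlation between two equal-length sequences."""
    n = len(test_scores)
    if n != len(job_ratings) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(test_scores) / n
    mean_y = sum(job_ratings) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(test_scores, job_ratings))
    sxx = sum((x - mean_x) ** 2 for x in test_scores)
    syy = sum((y - mean_y) ** 2 for y in job_ratings)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: ratings rise in lockstep with scores, then fall.
perfect_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
perfect_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
```

With the hypothetical figures above, the first call sits at the +1.0 end of the scale and the second at the -1.0 end; real validation data would fall somewhere in between.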

To comply with the Uniform Guidelines, all types of validation studies must conform to several technical requirements; the minimum standards are set forth at 29 C.F.R. § 1607.14. The guidelines specifically state that “[u]nder no circumstances will the general reputation of a test ... its author ... or casual reports of it’s [sic] validity be accepted in lieu of evidence of validity. Specifically ruled out are ... nonempirical or anecdotal accounts of selection practices or selection outcomes.” 29 C.F.R. § 1607.9(A). “Professional supervision of selection activities is encouraged but is not a substitute for documented evidence of validity.” 29 C.F.R. § 1607.9(B).

In its motion for summary judgment, Ford focused on two issues: (1) whether plaintiffs stated a disparate impact claim under Ohio law for race-based employment discrimination where the undisputed facts demonstrate that Ford has hired African Americans at levels exceeding the level at which African Americans are represented in the relevant occupation in the local labor force and (2) whether Ford is entitled to summary judgment on plaintiffs’ test-based disparate impact claim where the undisputed facts demonstrate that the test is valid and job-related and that there is no equally valid test available that would produce less disparate impact. Plaintiffs opposed the motion, relying principally on the affidavit of Dr. Charles Cranny who asserted that Ford’s pre-employment test was not properly validated.

Treating Ford’s first claim as a claim that the plaintiffs had failed to make out a prima facie case of test-based discrimination, the district court held that Ford was not entitled to summary judgment on that basis. First, the court concluded that, while it was undisputed that more African Americans took the HSSTB than their numbers in the relevant labor market would predict, the plaintiffs’ claim was not that insufficient numbers of African Americans took the test but that African Americans failed or scored “low” on the test in disproportionately high numbers when compared to whites, a fact that is also undisputed. Ford had not presented evidence to support its claim that the plaintiffs’ statistical evidence or their analysis of that evidence was flawed, and the district court held that plaintiffs’ statistical evidence was sufficient to withstand Ford’s claim that plaintiffs had failed to establish a prima facie case of disparate impact discrimination. The district court also concluded that, contrary to Ford’s contention, the “bottom line” rule of Connecticut v. Teal, 457 U.S. 440, 102 S.Ct. 2525, 73 L.Ed.2d 130 (1982) would be applicable in this state-law class action; the court therefore refused to grant Ford summary judgment based on its claim that because this is a class action, the undisputed evidence that Ford hires more African Americans in Ohio than would be predicted by their numbers in the relevant labor pool precludes the plaintiffs from proceeding.

The district court then turned to Ford’s claim that it is entitled to summary judgment on plaintiffs’ test-based disparate impact claim because it is undisputed that the test has been properly validated and there has been no showing of an alternative test that is both equally valid and less adverse in its impact on African Americans than the HSSTB. Noting that although the burden of persuasion continued with the plaintiffs to demonstrate that the test fails to serve in a significant way the legitimate goals of the employer, the court examined the record evidence, including the “Project Technical Report” detailing the technical research and validation data supporting the HSSTB and the reports of Ford’s independent experts evaluating that data, to determine whether Ford had produced evidence of a legitimate business justification for its use of the test. The court concluded that the HSSTB had been developed by professionals and appropriately validated through content-related studies and criterion-related studies. Further, the court concluded that Ford had demonstrated that the job analysis was not race-biased, and had met its burden of showing that the HSSTB is both job-related and a valid predictor of the likelihood of success on the job. The court held that the plaintiffs had produced the report of an expert who disagreed with Ford’s reports and experts, but had failed to produce any independent validation study or any analysis sufficient to create a genuine issue as to whether Ford’s validity testing was flawed. Plaintiffs’ anecdotal evidence that individuals who had scored “low” on the HSSTB had nonetheless succeeded in other unskilled industrial jobs was similarly insufficient to create a genuine issue of material fact. Finding that Ford had shown to a reasonable degree of statistical certainty that its test is a legitimate and valid indicator of future performance on the job, the district court held that Ford was entitled to summary judgment because, although the test does disparately impact African American test-takers, it does not violate Ohio Rev.Code § 4112.02.

On appeal, plaintiffs argue that the district court erred in granting summary judgment to Ford because in so doing, the district court ignored the fact that plaintiffs presented expert testimony that conflicted with Ford’s expert testimony and weighed the credibility of the opposing experts; the district court erroneously found that the test has both content validity and criterion-related validity which is not supported by the evidence and in any event is a determination to be made by the trier of fact; and the district court applied an incorrect legal standard.

We decline to determine whether the Ohio Supreme Court would adopt the reasoning of Connecticut v. Teal, 457 U.S. 440, 102 S.Ct. 2525, 73 L.Ed.2d 130 (1982) in this class action, instead assuming for purposes of this appeal that the plaintiffs were successful in establishing a prima facie case of disparate impact discrimination. For the reasons that follow, we hold that the district court correctly concluded that Ford was entitled to summary judgment.

1. Content Validity of the HSSTB

The district court correctly concluded that the HSSTB is content valid. The Uniform Guidelines state that “[a] selection procedure can be supported by a content validity strategy to the extent that it is a representative sample of the content of the job.” 29 C.F.R. § 1607.14(C)(1). We conclude that content-validity is an appropriate method of validation of the test at issue here, and that Ford has demonstrated that the test has such validity.

The record demonstrates that Ford utilized HRStrategies, a professional test developer, to conduct a job analysis that complied with the technical standards contained in the Guidelines. HR began with Ford’s requirement that applicants for employment in hourly positions are not hired for a specific position but, depending upon such variables as production requirements and other employees’ exercise of seniority rights under the collective bargaining agreement, must be able to rotate among numerous job classifications. Using the Dictionary of Occupational Titles as a guide, HR assigned each of Ford’s roughly 1000 hourly positions into one of 13 categories; HR then utilized 238 Ford supervisors at 18 separate Ford locations, 211 of whom had at least 10 years tenure at Ford and 125 of whom had at least 10 years in their current jobs at Ford, to rate each of the job activities contained in the respective categories, first, as to the extent to which each activity was part of the jobs within the category, and second, as to the importance of each activity to the jobs within the category. These raters were further presented with a list of 59 different job requirements, identical across all jobs, and asked to rate each requirement as to its contemporaneous importance to the job and its anticipated importance five years hence. HR then performed overlap analyses of those ratings to determine the extent to which there was agreement across raters, examining the overlap among raters in the same Ford plant rating the same job, the overlap among raters in different plants of the same type with regard to each job group, and the overlap across different plant types for each job category and across the 13 categories. Finally, HR computed a job requirement profile for each job category by averaging the job requirement ratings for all of the experts in the respective category and, after performing a series of cluster analyses to determine which categories of jobs should be combined, resolved the 13 categories into three rationally based clusters: jobs related to machining/finishing/testing operations and processes; jobs related to off-line support for manufacturing processes; and jobs directly related to the assembly line and the assembly process. HR then determined which job requirements had a mean importance rating of 2 or higher, indicating that the requirement was important — both now and 5 years hence — to the jobs in the respective cluster, and which skill requirements were important across the clusters. From those findings, HR determined which skills would be assessed by the test to be administered by Ford.
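The averaging and threshold steps described above can be sketched as follows. This is a simplified illustration, not HR's actual method: the data layout, function names, requirement names, and sample ratings are all hypothetical, the rating scale is not stated in the opinion, and the overlap and cluster analyses are omitted. Only the threshold of a mean importance rating of 2 or higher, both now and five years hence, comes from the text.

```python
from statistics import mean

# From the opinion: a mean importance rating of 2 or higher marks a
# requirement as important. Everything else here is a hypothetical layout.
IMPORTANCE_THRESHOLD = 2


def requirement_profile(ratings):
    """Average each requirement's ratings across all raters.

    `ratings` maps a requirement name to a list of (now, in_five_years)
    rating pairs, one pair per rater.
    """
    return {
        req: (mean(now for now, _ in pairs), mean(fut for _, fut in pairs))
        for req, pairs in ratings.items()
    }


def important_requirements(ratings):
    """Requirements whose mean rating meets the threshold now AND in 5 years."""
    profile = requirement_profile(ratings)
    return sorted(
        req for req, (now, fut) in profile.items()
        if now >= IMPORTANCE_THRESHOLD and fut >= IMPORTANCE_THRESHOLD
    )


def important_across_clusters(clusters):
    """Requirements that are important in every cluster."""
    sets = [set(important_requirements(r)) for r in clusters.values()]
    return sorted(set.intersection(*sets)) if sets else []


# Hypothetical ratings from two raters per requirement in two clusters.
example = {
    "assembly": {"reading": [(3, 3), (2, 2)], "arithmetic": [(1, 2), (1, 1)]},
    "support": {"reading": [(2, 3), (2, 2)], "arithmetic": [(3, 3), (2, 2)]},
}
shared = important_across_clusters(example)
```

In this hypothetical data only "reading" clears the threshold in both clusters, so it is the only requirement treated as important across the clusters.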

Against this evidence, the plaintiffs presented the report and affidavits of their expert, Dr. Cranny. The district court found, and we agree, that Dr. Cranny’s opinions are wholly insufficient to create a genuine issue with regard to the content validity of the HSSTB. Dr. Cranny’s criticisms of the content-validity study are entirely conclusory, are unsupported by any specific data, and are premised on the unsupportable factual assertion that the data was collected after the test had been administered.

The district court cited Merit Motors, Inc. v. Chrysler Corp., 569 F.2d 666, 673 (D.C.Cir.1977) and Hayes v. Douglas Dynamics, Inc., 8 F.3d 88, 92 (1st Cir.1993) in holding that the existence of opposing experts does not foreclose summary judgment. Ford cites both Hayes and Mid-State Fertilizer Co. v. Exchange National Bank, 877 F.2d 1333, 1339 (7th Cir.1989) for the proposition that an expert witness with a conflicting opinion does not provide a free pass to trial. In Hayes, plaintiffs produced the affidavits of a number of expert witnesses and argued that, under Federal Rule of Evidence 705, the court must accept the conclusions of these experts. The First Circuit ruled that “while nonmovants may rely on the affidavits of experts in order to defeat a motion for summary judgment, such evidence must still meet the standards of Rule 56. Fed.R.Civ.P. 56(e) requires that the nonmoving party ‘set forth specific facts showing that there is a genuine issue for trial,’ ” and noted that “[a]lthough expert testimony may be more inferential than that of fact witnesses, in order to defeat a motion for summary judgment an expert opinion must be more than a conclusory assertion about ultimate legal issues.” 8 F.3d at 92 (citing Bowen v. Manchester, 966 F.2d 13, n. 16 (1st Cir.1992); Moody v. Boston & Maine Corp., 921 F.2d 1, 5 (1st Cir.1990); Merit Motors, Inc. v. Chrysler Corp., 569 F.2d 666, 673 (D.C.Cir.1977) (“The evidentiary rules regarding expert testimony at trial were ‘not intended ... to make summary judgment impossible whenever a party has produced an expert to support its position.’ ”)). Thus, the First Circuit held

We are not willing to allow the reliance on a bare ultimate expert conclusion to become a free pass to trial every time that a conflict of fact is based on expert testimony....
Where an expert presents “nothing but conclusions — no facts, no hint of an inferential process, no discussion of hypotheses considered and rejected”, such testimony will be insufficient to defeat a motion for summary judgment. Although an expert affidavit need not include details about all of the raw data used to produce a conclusion, or about scientific or other specialized input which might be confusing to a lay person, it must at least include the factual basis and the process of reasoning which makes the conclusion viable in order to defeat a motion for summary judgment. We find that each of the expert affidavits failed to contain sufficient support for the conclusion that the injury to [plaintiff] was caused by the defendant’s product.

8 F.3d 88, 92 (internal citation omitted).

This circuit reached a similar conclusion in Monks v. General Electric Co., 919 F.2d 1189, 1192 (6th Cir.1990). The court initially pointed out that “To the extent Merit Motors articulates any type of standard, it is that affidavits by experts who have not familiarized themselves with the record are insufficient to withstand a summary judgment motion.” The court went on to find that because the expert’s affidavit offered by the plaintiff “would under no circumstances be sufficient, by itself, to establish an element of plaintiff’s case, we conclude that summary judgment for defendants was properly granted.” Id. at 1193.

In challenging Ford’s motion for summary judgment, the plaintiffs are not required to prove their claims, but they are required to present specific facts sufficient to raise a genuine issue material to the relationship between the HSSTB and the job activities and requirements of Ford’s hourly production jobs. Plaintiffs cannot challenge the motion for summary judgment by relying on allegations contained in their complaint or on affidavits that merely state conclusory allegations. Fed.R.Civ.P. 56. Ford correctly asserts and the district court correctly found that Dr. Cranny’s affidavit did not raise a genuine issue of fact material to the content validity of the HSSTB.

The plaintiffs claim that Ford did hire some applicants who scored “low” on the test, and that this negates the test’s content validity. Plaintiffs point to no authority to support this claim, and they presented no evidence to counter Ford’s testimony that any such hires were in fact “rehires” — individuals who had once worked at Ford and who were rehired only pursuant to an agreement with the union in spite of their poor performance on the HSSTB. Finally, Plaintiffs presented no alternative test or selection device whatsoever, much less one that lacks the adverse impact of the HSSTB and that would serve Ford’s legitimate interest in hiring adequately skilled workers.

Accordingly, we conclude that Ford has sustained its burden of showing that the HSSTB is manifestly job-related; that is, Ford has demonstrated that “the content of the selection procedure is representative of important aspects of performance on the job for which the candidates are to be evaluated.” 29 C.F.R. § 1607.5(B). Plaintiffs have wholly failed to present any evidence to raise a genuine issue of fact with regard to the content validity of the test or to raise a genuine issue with regard to whether Ford’s use of the test is pretextual.

2. Criterion-related validity of the HSSTB

Having concluded that Ford had demonstrated that the HSSTB is content valid, the district court went on to determine whether the test had criterion-related validity. Neither the case law nor the Uniform Guidelines purports to require that an employer must demonstrate validity using more than one method. See 29 C.F.R. § 1607.5(A) (“For the purposes of satisfying these guidelines, users may rely upon criterion-related validity studies, content validity studies or construct validity studies”)(emphasis added); § 1607.14(C)(1) (“Users choosing to validate a selection procedure by a content validity strategy should determine whether it is appropriate to conduct such a study in the particular employment context.”). Ford appears to concede that, under 29 C.F.R. § 1607.14(C), to the extent that the HSSTB purports to assess or to draw inferences about “mental processes,” the test must be shown to have validity beyond mere content validity. That section says that “a content strategy is not appropriate for demonstrating the validity of selection procedures which purport to measure traits or constructs, such as intelligence, aptitude, personality, commonsense, judgment, leadership, and spatial ability.” 29 C.F.R. § 1607.14(C)(1). Further, § 1607.14(C)(4) says that “[i]n the case of a selection procedure measuring a knowledge, skill, or ability, the knowledge, skill, or ability being measured should be operationally defined.... As the content of the selection procedure less resembles a work behavior, or the setting and manner of the administration of the selection procedure less resemble the work situation, or the result less resembles a work product, the less likely the selection procedure is to be content valid, and the greater the need for other evidence of validity.” Although plaintiffs have not pointed with any specificity to the traits, constructs, skills, knowledges or abilities for which the HSSTB purports to test, we will accept as the basis for the claim that this test requires a showing of criterion-related validity the fact that the HSSTB does test in such areas as reading comprehension and arithmetic ability. The district court found that Ford had adequately demonstrated that the HSSTB has criterion-related validity. For the reasons that follow, we conclude that the district court did not err in that conclusion.

Ford’s criterion-related study was conducted by the same professional test developer, HRStrategies. HR selected 105 employees, representing four different Ford plants, to participate in the study. These employees had been on the job for three months, were white (53%), African American (38%) and Hispanic (9%), male and female, under and over 40 years old, and had varying degrees of education. HR then utilized 46 Ford supervisors, identified by local Ford personnel as being knowledgeable about the job performances of the employees participating in the study. The supervisors were carefully educated about the study, its purpose and its content; they were required to supply extensive demographic information about themselves; they were instructed in detail on how they were to evaluate the performance of the employees with whom they were familiar and given the procedure to use in the event they were uncomfortable rating any particular employee. The supervisors then rated the job performance of their respective assigned employees on a defined scale for each item within each of eight job requirement dimensions being evaluated, and provided as well a rating of overall performance within each dimension and an overall rating of job performance across dimensions. The job performances of these 105 employees and their HSSTB scores (which were neither revealed nor available to the supervisors evaluating the job performances) were analyzed to examine the relationship between test score and job performance.

The statistical analysis performed by HR showed that the supervisors rating job performance did not rate white employees and African American employees differently; the ratings did not differ based on the race of the rater; raters did not rate study participants of the same race systematically higher than they rated study participants of a different race. Further, the statistical analysis showed a correlation coefficient — the correlation between test performance and job performance — of .30. It is undisputed that a correlation coefficient of .30 is statistically significant and sufficient to establish job-relatedness.

Plaintiffs again presented little more than conclusory and factually unsupported or legally incorrect criticisms of the criterion-related study, including that the study was flawed because it did not include persons who scored “low” on the HSSTB; that several individuals who had scored “low” on the test had succeeded in unskilled positions with employers other than Ford; that HR had combined two samples of employees in computing an overall correlation coefficient for the full test battery, which requires the use of assumptions of which the Guidelines disapprove; and that the study was flawed because it was conducted after the HSSTB had been implemented as a selection procedure at Ford.

We conclude that the district court was correct in its determination that plaintiffs simply failed to provide the kind of specifics necessary to raise a genuine issue with regard to whether the HSSTB has criterion-related validity. The law does not require that an employer, simply in order that low scorers may be included in validation studies, hire individuals who do not pass a pre-employment test. See Clady v. County of Los Angeles, 770 F.2d 1421, 1431 (9th Cir.1985) (rejecting the contention that a validity study must include individuals who failed the challenged test); see also 29 C.F.R. § 1607.14(B)(1) (“These guidelines do not require a user to hire or promote persons for the purpose of making it possible to conduct a criterion-related study.”). Neither does the law require that a criterion-related study be performed before the test is utilized. See Albemarle Paper Co., 422 U.S. at 433, n. 32, 95 S.Ct. 2362 (noting only that the study in that case had been prepared immediately before trial, and studies “closely controlled by an interested party in litigation must be examined with great care.”). Furthermore, there is no evidence in the record to support plaintiffs’ claim that this study was prepared for purposes of this litigation. At most, there is the statement of Ford’s Robert Lorenzo that he ordered the April 1995 HRStrategies Project Technical Report, which compiled into report form and summarized all of the studies that had been undertaken beginning in 1989, with an eye toward potential litigation.

Plaintiffs did present the affidavits of several individuals who scored “low” on the HSSTB but who were ultimately successful in working unskilled jobs in the employ of companies other than Ford. Plaintiffs point us to no authority, however, and we have found none, to support their position that such affidavits are sufficient to create a genuine issue with regard to the test’s content validity, i.e., the statistically significant correlation between the specific skills assessed by the test and the specific skills important to job performance.

The evidence is undisputed that in the initial criterion study, roughly half of the participating employees had taken one form of manual dexterity test in their pre-employment test and the other half had taken another form of manual dexterity test. For that reason, the two sub-samples were combined and the correlation coefficient was .30. Although plaintiffs complain that this procedure required the use of assumptions that are disapproved by the Guidelines, plaintiffs make no attempt to provide any evidence to support that claim; the expert evidence provided by Ford, however, explicitly refutes any such claim.

We hold that the district court did not err in concluding that Ford had sustained its burden of demonstrating that the HSSTB has criterion-related validity, that is, Ford has demonstrated through empirical data “that the selection procedure is predictive of or significantly correlated with important elements of job performance.” 29 C.F.R. § 1607.5(B). Plaintiffs have failed to raise any genuine issue for trial with regard to the criterion-related validity of the test. Again, plaintiffs wholly failed to present any alternative selection device which might be evidence that Ford’s use of the HSSTB was pretextual. Accordingly, we affirm the district court’s judgment that the HSSTB, although it may have a disparate impact on African American test-takers, does not violate Ohio Rev.Code § 4112.02.

3. Individual plaintiffs’ claims

Finally, plaintiffs argue that the district court erred in granting summary judgment on the individual plaintiffs’ claims of discrimination. We disagree. This action was initially filed as a class action claiming that Ford had engaged in an unspecified pattern and practice of discrimination in hiring against African Americans; a subsequent tag-along action, raising only allegations of hiring discrimination because of the use of the pre-employment test, was filed by plaintiffs Dale and Thomas. The district court certified a class that excluded Plaintiff Harris, one of the named plaintiffs in the initial class action complaint. The parties have conceded that two others of the named plaintiffs in the initial complaint are also not properly members of the class. In its certification order, however, the district court held that the two tag-along plaintiffs were adequate representatives of the class. When Ford moved for summary judgment, it demanded judgment on all claims in the complaint. Plaintiffs presented no evidence relative to claims of discrimination against individual plaintiffs in opposition to the motion for summary judgment; plaintiffs did not oppose Ford’s motion for judgment in the tag-along action after the district court granted Ford’s motion for summary judgment in the class action and dismissed that action; and plaintiffs did not file any post-judgment motion bringing to the district court’s attention their contention that the judgment was overbroad.

We find no error in the grant of summary judgment and dismissal of the entire case. Although the plaintiffs contend on appeal that the “claims of Williams, McMiller, and Spinkston raise the issue of racial discrimination in the selection procedure used by Ford at its Lorain, Ohio plant in choosing who would get to take Ford’s HSS test battery,” we are hard-pressed to find any such specific individual claim raised in the class action complaint, and the tag-along complaint raises only the adverse impact caused by the challenged test. Ford responds to plaintiffs’ claim by noting that, because it proffered evidence in support of its motion for summary judgment demonstrating that 18.3 percent of the test takers at Lorain Assembly were African American, though African Americans comprised only 8.7 percent of the relevant labor force in Ohio, Ford met its initial burden of demonstrating the absence of a genuine issue of material fact regarding the claims of Williams, McMiller, and Spinkston, and plaintiffs’ failure to adduce competent evidence to support the individual claims rendered summary judgment appropriate as to the individual claims. We note in passing that before the district court, the plaintiffs did not dispute the fact that African Americans took Ford’s pre-employment test in numbers disproportionate to their representation in the relevant labor market.

The plaintiffs indeed failed to show any discrepancy between the composition of the pool of actual test-takers and the composition of the pool of candidates for testing in the relevant labor market as is required to make out a prima facie case of adverse impact under Ohio law under Little Forest Medical Center v. Ohio Civil Rights Comm’n, 61 Ohio St.3d 607, 575 N.E.2d 1164, 1168 (Ohio 1991). The statistics in the record show an overrepresentation of African Americans among Ford’s test-takers. We conclude that the district court properly granted summary judgment on the adverse impact claims of the individual plaintiffs.

C. Conclusion

For the reasons stated above, we affirm in its entirety the judgment of the district court granting summary judgment to the defendant. 
      
      . Despite his "medium” score, Harris was told he failed the test and was denied employment at Ford's Avon Lake plant because that plant hired only "high” scoring applicants.
     
      
      . For ease of reference and to avoid confusion, we will refer to the consulting firm simply as HR, regardless of which name it was operating under at the particular stage of test development.
     
      
      . The final HSSTB reportedly was 25 percent shorter than the interim test, the test components were not identical and calculations for the two batteries differed.
     
      
      . The Williams complaint specifically refers to Ohio Rev.Code § 4112.02(E) as well, but nothing in the facts of the complaint implicates any provision in that subsection.
     
      
      . As noted by the Second Circuit in Guardians Ass’n of New York City Police Department, Inc. v. Civil Service Comm'n of New York, 630 F.2d 79, 91 (2nd Cir.1980), the Supreme Court has relied upon some of the Guidelines in several of the leading cases. See Albemarle, 422 U.S. at 431, 95 S.Ct. 2362; Espinoza v. Farah Mfg. Co., 414 U.S. 86, 94, 94 S.Ct. 334, 38 L.Ed.2d 287 (1973); Griggs v. Duke Power Co., 401 U.S. 424, 433-34, 91 S.Ct. 849, 28 L.Ed.2d 158 (1971). However, the Court has not ruled that every deviation from any of the Guidelines automatically results in a violation of Title VII.
      The Court appears to have applied the Guidelines only to the extent that they are useful, in the particular setting of the case under consideration, for advancing the basic purposes of Title VII. See Espinoza, 414 U.S. at 94, 94 S.Ct. 334.... To the extent that the Guidelines reflect expert, but nonjudicial opinion, they must be applied by courts with the same combination of deference and wariness that characterizes the proper use of expert opinion in general. See Albemarle, 422 U.S. at 449, 95 S.Ct. 2362 (Blackmun, J., concurring) (Guidelines "have never been subjected to the test of adversary comment. Nor are the theories on which the Guidelines are based beyond dispute.”) Thus, the Guidelines should always be considered, but they should not be regarded as conclusive unless reason and statutory interpretation support their conclusions.
      630 F.2d at 91.
      The Second Circuit went on to note that although the Guidelines also possess limited legal force in addition to their value as the expression of expert opinion, "[t]he primary purpose of the Guidelines is to indicate the standards that various Federal agencies ... are to use in enforcing Title VII and related statutes. But the fact that an agency ... has announced the standards they will use does not convert those standards into mandatory legal rules.” 630 F.2d at 91 (internal citation omitted).
     
      
      . 29 C.F.R. § 1607.5(F) states:
      Caution against selection on basis of knowledge, skills, or ability learned in brief orientation period. In general, users should avoid making employment decisions on the basis of measures of knowledges, skills, or abilities which are normally learned in a brief orientation period, and which have an adverse impact.
     
      
      . Sixteen percent of these supervisors were African American; Census Bureau data demonstrates that 6.7% of manufacturing supervisors in the labor force as a whole are African American. Analysis of the results of the ratings by white and African American supervisors demonstrates no statistically significant differences between them.
     
      
      . The median agreement on which tasks were part of the job was 85%; the median as to the importance of job activities was 88%; the median as to the importance of job requirements was 69%.
     
      
      . The median agreement on which tasks were part of the job was 87%; for activity importance — 95%; for job requirement importance — 70%.
     
      
      . The median level of agreement on the importance of job requirements across job categories was 68%.
     
      
      . Of these, 27.1% were African American.
     
      
      . These job requirement dimensions came from the job analysis whose results formed the basis for the development of the HSSTB. That job analysis identified 59 job requirements, which comprised nine job requirement dimensions. Eight of those — (1) solving problems; (2) learning and applying knowledge; (3) dealing with others; (4) dealing with communications; (5) working productively; (6) maintaining quality standards; (7) work orientation; and (8) physical capabilities — were chosen by HR for use in the criterion study.
     
      
      . Specifically, the HR study found that the .30 correlation coefficient was statistically significant at the 0.01 level, which means that there is less than one chance in one hundred that the .30 correlation coefficient was the product of chance. The Uniform Guidelines provide that "a selection procedure is considered related to the criterion ... when the relationship between performance on the procedure and performance on the criterion measure is statistically significant at the 0.05 level of significance, which means that it is sufficiently high as to have a probability of no more than one (1) in twenty (20) to have occurred by chance.” 29 C.F.R. § 1607.14(B)(5).
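      By way of illustration only, the arithmetic behind such a significance statement can be sketched as follows. The sketch assumes a simple Pearson correlation computed over all 105 study participants and uses the Fisher z-transformation with a normal approximation; the record as described here does not state which test HR actually applied, and the function name is hypothetical.

```python
import math


def fisher_z_p_value(r: float, n: int) -> float:
    """Approximate two-sided p-value for the null hypothesis of zero correlation.

    Uses the Fisher z-transformation: atanh(r) is approximately normal with
    standard error 1 / sqrt(n - 3) when the true correlation is zero.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided tail probability of a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))


# Figures taken from the opinion; treating them as one simple sample is an assumption.
correlation = 0.30
sample_size = 105

p_value = fisher_z_p_value(correlation, sample_size)
print(f"approximate two-sided p-value: {p_value:.4f}")
print("significant at the 0.05 level:", p_value < 0.05)
print("significant at the 0.01 level:", p_value < 0.01)
```

      Under these assumptions the approximate p-value falls below both the 0.05 threshold in the Guidelines and the 0.01 level reported by HR.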
     
      
      . The district court erred in holding that even if Ford had failed to sustain its burden of demonstrating that the HSSTB is manifestly job-related, plaintiffs were required to present an alternative selection procedure that would meet Ford's legitimate hiring needs, and having failed to do so, plaintiffs could not withstand Ford's motion for summary judgment. Albemarle Paper Co., 422 U.S. at 431, 95 S.Ct. 2362, makes it clear that both the Guidelines and Griggs v. Duke Power Co., 401 U.S. 424, 91 S.Ct. 849, 28 L.Ed.2d 158 (1971) forbid the use of discriminatory tests unless those tests are appropriately demonstrated to be job-related. In light of our holding that Ford did not fail to sustain its burden of demonstrating both content and criterion-related validity, and that plaintiffs had wholly failed to raise any evidence of pretext, this error is harmless.
     
      
      . At oral argument before this court, plaintiffs complained that the district court had restricted their briefing to 20 pages, and this was simply insufficient to include any mention of the individual claims. If the plaintiffs believed that they needed additional pages to brief the individual claims before the district court, they should have raised that contention before that court.
     