
    Edna H. SOBEL, M.D., and Bella C. Clutario, M.D., on behalf of themselves and other professional faculty members employed by the defendant, Yeshiva University, Plaintiffs, Edna H. Sobel, M.D., on behalf of herself and other professional faculty members employed by the defendant, Yeshiva University, Plaintiff-Appellant, Equal Employment Opportunity Commission, Plaintiff-Intervenor, v. YESHIVA UNIVERSITY, Defendant-Appellee.
    No. 204, Docket 87-7373.
    United States Court of Appeals, Second Circuit.
    Argued Oct. 14, 1987.
    Decided Feb. 4, 1988.
    
      Eleanor Jackson Piel, New York City, for plaintiff-appellant.
    Daniel Riesel, New York City (Lawrence R. Sandak, Robert R. Reed, Sive, Paget & Riesel, New York City, of counsel), for defendant-appellee.
    Before KEARSE, PIERCE, and PRATT, Circuit Judges.
   GEORGE C. PRATT, Circuit Judge:

In his masterpiece Bleak House, Charles Dickens painted a scathing portrait of the hopeless complexity of the handling of cases in England’s High Court of Chancery. Dickens wrote, “[T]hrough years and years, and lives and lives, everything goes on, constantly beginning over and over again, and nothing ever ends. And we can’t get out of the suit on any terms, for we are made parties to it * * *.” Mindful of the example of the never-ending litigation that marked Dickens’s Chancery Court, it is with regret that we find it necessary to once again remand this nearly thirteen-year old action to the district court, for new proceedings which we can only hope will at last end the litigation between these parties.

This is a complicated sex discrimination class-action suit against Yeshiva University. The core complaint alleges that Yeshiva discriminated against women faculty members at its medical school, the Albert Einstein College of Medicine (“AECOM”), by paying them a lower salary on the basis of their gender in violation of Title VII of the Civil Rights Act of 1964, 42 U.S.C. § 2000e et seq. The alleged violations run from 1974, the earliest date for which the statute of limitations had not run at the time plaintiffs filed suit in 1975, to 1979.

After seven years of discovery, trial began in September 1982, with plaintiffs’ case being carried for the most part by the intervenor, the Equal Employment Opportunity Commission (“EEOC”). After approximately three weeks of trial, the district judge took the matter under advisement, rendering his decision in June 1983, when he determined that plaintiffs had failed to establish a prima facie case of disparate treatment in faculty salaries, that their claim of disparate impact was procedurally barred, and that defendant’s pension plan, based on sex-segregated mortality tables, was illegal. Sobel v. Yeshiva University, 566 F.Supp. 1166 (S.D.N.Y.1983) (“Sobel I”). The last of these findings is not relevant to this appeal.

On appeal, this court remanded the case for reconsideration in light of the Supreme Court’s intervening decision in Bazemore v. Friday, 478 U.S. 385, 106 S.Ct. 3000, 92 L.Ed.2d 315 (1986). Sobel v. Yeshiva University, 797 F.2d 1478 (2d Cir.1986) (“Sobel II”). On remand, the district court adhered to its initial decision, concluding that the procedural bar it had initially found to preclude plaintiffs from raising their “disparate impact” claim was a sufficient independent basis for its original decision, Bazemore and this court’s remand order notwithstanding. Sobel v. Yeshiva University, 656 F.Supp. 587 (S.D.N.Y.1987) (“Sobel III”). Because we conclude that there was no “procedural bar”, and that plaintiffs' Bazemore claim, whether it is characterized as one of “disparate treatment” or “disparate impact”, deserves full and fair evaluation, we again remand, but this time for a new trial on that claim. For reasons discussed below, we direct that the case be reassigned to a different district judge.

BACKGROUND

In 1972, named-plaintiff Dr. Edna Sobel, a faculty member in the pediatrics department at AECOM, began to express dissatisfaction with her salary, which she contended was too low for a full professor of her experience and stature. She complained to her then department head, Dr. Lewis Fraad, and for the next two years received so-called “out-of-guideline” increases in salary. The second of these increases, however, was not sufficiently large to satisfy Dr. Sobel, and she initiated this lawsuit, having told Dr. Fraad that “It seems that I am going to have to sue to get appropriate pay.” Trial Tr. at 1144-51.

The basic structure of AECOM’s faculty and salary systems are set forth in Sobel I, with which we assume familiarity. See Sobel I, 566 F.Supp. at 1169-73. By the time of trial, Sobel had been joined by Dr. Bella Clutario as named plaintiffs. The class they represented had been reduced to full-time female faculty members with M.D. degrees employed by AECOM between 1974 and 1979, a group that numbered from 49 to 60 during each year out of a total full-time faculty that ranged from 204 to 295 during the period.

The salary received by any one faculty member was a result of numerous factors, some readily quantifiable and some inherently amorphous. The parties agreed on the importance of such factors as experience, numbers of publications in scholarly journals, and the department in which the doctor worked. They further agreed that the rank held by a particular faculty member (assistant professor, associate professor, or full professor) was a factor influencing salary, although plaintiffs raised strong objections to its inclusion on grounds we will soon discuss.

The nub of the dispute, of course, was whether plaintiffs adequately established that when such legitimate factors were accounted for in the admitted disparity between the average salaries of male and female faculty members, there remained a difference that could be explained only by reference to the person’s sex.

Plaintiffs attempted to demonstrate this proposition using a common statistical tool, multiple regression analysis, which is designed to isolate the influence of one particular factor — here, sex — on a dependent variable — here, salary. One of plaintiffs’ expert witnesses, Dr. Orley Ashenfelter, testified that the plaintiffs’ statistical model was designed to approximate the factors that influenced salary at AECOM, so that “any differences between the salaries of men and women that were not explained by the pertinent variables to be used in the model had to be the result of sex discrimination.” Sobel I, 566 F.Supp. at 1174. As with any multiple regression analysis, the validity of the influence attributed to a particular variable will depend heavily on how accurately the model mimics the actual factors influencing the dependent variable, salary. For example, if the model omits an important variable that affects salaries, the portion explained by that variable will seem to be unexplained, and thus may erroneously be attributed to sex. Conversely, if an extraneous factor is erroneously credited with influencing salary, it may serve to mask the effect of sex on faculty compensation.
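The omitted-variable point in the preceding paragraph can be illustrated with a small numerical sketch. Everything in it is hypothetical and is not drawn from the record: the twenty faculty members, the salary figures (in thousands of dollars), and the single "clinical" indicator are assumptions chosen so that sex has no effect on pay at all. The helper `ols` is an illustrative least-squares solver, not the method used by either party's experts.

```python
from fractions import Fraction

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y], held as exact fractions.
    a = [[Fraction(sum(r[i] * r[j] for r in rows)) for j in range(k)]
         + [Fraction(sum(r[i] * yi for r, yi in zip(rows, y)))]
         for i in range(k)]
    # Gauss-Jordan elimination.
    for col in range(k):
        pivot = next(r for r in range(col, k) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        a[col] = [v / a[col][col] for v in a[col]]
        for r in range(k):
            if r != col:
                factor = a[r][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [float(a[i][k]) for i in range(k)]

# Hypothetical faculty as (female, clinical) pairs: men are mostly in
# clinical departments, women mostly in pre-clinical ones.
faculty = ([(0, 1)] * 8 + [(0, 0)] * 2 +   # 10 men
           [(1, 1)] * 2 + [(1, 0)] * 8)    # 10 women

# Assumed pay rule: 40 base plus 20 for clinical work; sex plays no part.
salary = [40 + 20 * clinical for _, clinical in faculty]

# Model omitting the department variable: intercept and sex only.
short = ols([[1, f] for f, _ in faculty], salary)
# Model including it: intercept, sex, and clinical.
full = ols([[1, f, c] for f, c in faculty], salary)

print("sex coefficient, department omitted: ", short[1])
print("sex coefficient, department included:", full[1])
```

In this sketch the model that omits the department variable attributes a gap of 12 to sex, and the gap disappears once the variable is included. The same arithmetic cuts the other way: if the second column stood for a factor that was itself withheld from women, including it would absorb, and so conceal, a real disparity.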

After considerable wrangling, the parties were able to agree on virtually all of the data to be used in the studies to be done by their experts. The data base included the names of the several hundred M.D.s employed as faculty by AECOM during the relevant period, their salaries from year to year (and, therefore, the incremental year-to-year increases in salary of each faculty member), their rank or ranks during the period, the frequency of their publications, their experience, both at AECOM and as reflected in the number of years since they received their M.D.s, and various other information.

From this agreed-upon data base, the parties proceeded in different directions. From their studies, “plaintiffs' experts determined to their satisfaction that the salary differences disfavoring women were statistically significant (at the 0.05, or two standard deviation, level) for the year 1970 and the years 1973 through 1978.” Id. at 1175.

Yeshiva’s experts attacked both the adequacy of a multiple regression analysis generally in this sort of employment context, and the particular study done by plaintiffs’ experts. Their conclusion, based on their own study of the data, was that there was no evidence of salary discrimination, either in the initial setting of salaries or in the annual wage increases. Further, they argued that, when the proper variables were included (and the improper ones excluded), even a multiple regression did not show a statistically significant salary disparity based on sex.

The district court largely accepted the conclusions of Yeshiva’s experts. It found plaintiffs’ regression analysis to be riddled with “major shortcomings”. Among these was the failure adequately to deal with the relative disparity in salaries between faculty members in the clinical departments, which tend to be both disproportionately male and higher paid, and in the “pre-clinical” departments, which were heavily female and relatively lowly paid. In addition, the district court faulted plaintiffs for failing to distinguish between clinicians who were primarily researchers and those whose primary activity was the practice of medicine, and for failing adequately to account for “inherent departmental stratification”. Most important, the court concluded that plaintiffs’ variables that attempted to act as proxies for the inherently amorphous impact of “productivity” on salary “failed to adequately account for the true productivity differences and that the consequential underadjustment for these differences resulted in an overestimate of the sex coefficients.” Id. at 1179.

In order to introduce a variable that would at least approximate productivity, Yeshiva proposed, and the district court accepted, using the rank of the faculty member. Sobel strenuously objected to including rank as a variable, arguing that it was not an independent variable, because an individual’s rank resulted from the same factors that determined salary increases. However, the district court believed rank served to capture intangible factors that did not affect the annual salary increase a faculty member received. Id. at 1180.

The plaintiffs also objected to using rank on a more fundamental ground. For the very reason the district court felt rank should be used — that it reflected intangible productivity factors — plaintiffs argued that there was a serious risk that impermissible factors would have entered into promotion decisions, and thus that rank, insofar as it determined salary, might reflect sex discrimination. Because of this risk, plaintiffs urged, “academic rank should have been included as an explanatory variable only where there was clear evidence that neutral and objective standards had consistently been followed and there was no chance that the decisions regarding rank had been affected by sexual discrimination.” Id. The district judge concluded, however, “that promotions in rank * * * were in fact based on merit and were not contaminated by elements of sexual discrimination.” Id. (footnote omitted).

Based on these and other difficulties it had with plaintiffs’ multiple regressions, the district court held that plaintiffs had failed to make out a case of disparate treatment in faculty salaries. One of the factors leading the court to adopt Yeshiva’s view was that plaintiffs did not factor out the effects of discrimination in salaries that occurred before Title VII was made applicable to universities in 1972 (“pre-act”). Under its view of the law, discrimination originating in the pre-act period was not a valid basis for a current finding of a Title VII violation, and the trial court faulted plaintiffs’ conclusion because their study “was not designed to analyze the extent to which salary differentials may have been the consequence of discriminatory acts that occurred” before 1972. Id. at 1182.

This view also led the district court to reject what it characterized as plaintiffs’ “disparate impact” claim, which consisted of the contention that the women faculty members hired before 1972, who allegedly received discriminatorily low starting salaries, never “caught up” to their male counterparts. In the district court’s view, this constituted a disparate impact claim because the mechanism for determining salary increases, the “guideline” system, which provided men and women with roughly equal raises in percentage terms, amounted to a facially neutral system that had the effect of keeping women’s salaries on the AECOM faculty perpetually lower than men’s. Such a facially neutral system that is alleged to have an effect that disproportionately harms a particular group is the essence of a “disparate impact” claim. See generally Dothard v. Rawlinson, 433 U.S. 321, 329, 97 S.Ct. 2720, 2727, 53 L.Ed.2d 786 (1977).

As a general rule, faculty members received the “guideline” increase, which in some years was a simple percentage and in others was a percentage combined with a minimum raise in absolute terms. Some faculty members in some years received what were referred to at trial as “out-of-guideline” increases, normally given when a faculty member was able to convince his department head that his salary was inequitably low and that he should take up the professor’s cause before the administration.

The district court rejected the disparate impact claim on two grounds. First, it found the argument procedurally barred, because plaintiffs had not raised it until late in the trial, after seven years of discovery; it thus seemed unfair to Yeshiva to face an entirely new claim so late in the game.

Second, the court found that any disparities in salaries perpetuated by the guideline system resulted from pre-1972 hires and pre-1972 salaries, which the court said “did not have to meet the standards established under Title VII.” 566 F.Supp. at 1188.

While plaintiffs’ appeal from this judgment was pending, the Supreme Court decided Bazemore, in which the Court held that a pre-act salary disparity that is carried over into the period following application of the act constitutes a violation of the act. As Justice Brennan wrote for the Court,

A pattern or practice that would have constituted a violation of Title VII, but for the fact that the statute had not yet become effective, became a violation upon Title VII’s effective date, and to the extent an employer continued to engage in that act or practice, he is liable under that statute.

106 S.Ct. at 3006. Bazemore thus represented an important and dramatic shift in Title VII law. Most of the lower federal courts had interpreted the Court’s decision in United Air Lines, Inc. v. Evans, 431 U.S. 553, 97 S.Ct. 1885, 52 L.Ed.2d 571 (1977), as holding that employers did not have to equalize salaries that were discriminatory if the disparity originated prior to the application of Title VII. See, e.g., Ste. Marie v. Eastern R. Ass’n, 650 F.2d 395, 404 n. 11 (2d Cir.1981); Farris v. Board of Educ., 576 F.2d 765, 769 (8th Cir.1978). Since Bazemore, courts have recognized as a valid claim for relief the allegation that an employer has failed to remedy the continuing effects of pre-act salary discrimination. See Trout v. Lehman, 652 F.Supp. 144, 146 (D.D.C.1986); cf. Rodriguez v. Chandler, 641 F.Supp. 1292, 1298 n. 18 (S.D.N.Y.1986).

The Bazemore Court also addressed a key evidentiary issue: the weight to be accorded a multiple regression analysis which purports to show discrimination, but which the defendant argues fails to account for certain relevant variables. The Court held that the “failure to include variables will affect the analysis’ probativeness, not its admissibility.” Id. 106 S.Ct. at 3009 (footnote omitted).

In the new light cast by Bazemore, we remanded this case to the district court to afford it an opportunity to determine in the first instance what effect it might have. Sobel II, 797 F.2d 1478. On remand, however, the district court made no attempt to assess the full impact of Bazemore on plaintiffs’ claims and evidence. Instead, the court adhered to its earlier judgment on the technical basis that even if Bazemore altered the law on the substantive issues, its ruling that plaintiffs had failed to raise their “disparate impact” theory until it was unfair to Yeshiva, provided a sufficient independent basis to uphold the judgment dismissing the complaint. Sobel III, 656 F.Supp. at 590-91. This second appeal followed.

DISCUSSION

I. The “Procedural Bar” to Plaintiffs’ Disparate Impact Claim.

We first address the district court’s holding that plaintiffs were precluded from raising a disparate impact claim against Yeshiva because of their failure to raise the claim “until the last minute”, Sobel III, 656 F.Supp. at 590, that is, “midway through the trial.” Sobel I, 566 F.Supp. at 1186. While we recognize that the district court is vastly more familiar than are we with the details of the seven years of pretrial proceedings in this case, we are, frankly, somewhat perplexed by this finding of a “procedural bar”. In our view, the essence of plaintiffs’ case was well known to Yeshiva throughout — and in fact long before — the trial, and it consisted of precisely the claim the district court found not to have been raised until midway through trial. Moreover, the distinction between “disparate treatment” and “disparate impact” cases drawn by the district court artificially and unrealistically pigeon-holed plaintiffs’ claim. Finally, even were we to agree that there was some distinct disparate impact claim that was not raised by plaintiffs until midway through trial, the district court on remand should not have precluded the claim because of the change in the law wrought by Bazemore. We address these points in turn.

A. The Timeliness of Plaintiffs’ Claim.

While it is true that a litigant may be barred from raising a claim if it would work an unfair surprise on her adversary, see Ste. Marie, 650 F.2d at 399 n. 2 (attempt on appeal to argue that evidence supported disparate impact theory not argued at trial rejected as “belated”); Presseisen v. Swarthmore College, 442 F.Supp. 593, 603 (E.D.Pa.1977), aff'd without op., 582 F.2d 1275 (3d Cir.1978); cf. Rossini v. Ogilvy & Mather, Inc., 798 F.2d 590, 604-05 (2d Cir.1986) (where plaintiff raised only disparate treatment argument at trial, “it would have been unfair to evaluate the evidence under the disparate impact theory after trial”), the critical question is whether the claim was in fact not raised until it was too late. We need not consider here whether the point indicated by the district court when plaintiffs allegedly first raised the disparate impact claim (“midway through the trial”) would have been too late, since the record contradicts the district court’s view and shows the claim was raised well in advance of trial.

The district court, and to some degree the parties, became caught up in the labels applied to claims, as opposed to the actual nature of them. Sobel argues on appeal that the single mention of the magic phrase “disparate impact” in the EEOC’s pre-trial memorandum was sufficient to raise the claim. This argument misses the point; what is important is whether the defendant was reasonably aware of the claim, not whether plaintiffs at some time in the pre-trial period happened to use the right phrase. See Mir v. Fosburg, 646 F.2d 342, 347 (9th Cir.1980) (question is whether the time at which defendants were made aware of the thrust of plaintiff’s case “will unfairly prejudice the defendants in their defense”); see also 5 C. Wright & A. Miller, Federal Practice and Procedure, § 1219 at 145-46 (1969 & 1987 Supp.).

Having examined the trial transcript and reviewed the pre-trial documents in the record on appeal, we are convinced that Yeshiva was fully cognizant that plaintiffs’ claim consisted of the central allegation that the system of guideline salary increases left the plaintiff class perpetually behind their male counterparts on the AECOM faculty. The trial brief jointly submitted by plaintiffs and the EEOC contained the following argument:

While it may not have been unlawful for AECOM to discriminate against women prior to 1972, such conduct did become unlawful after March 24, 1972. AECOM officials had broad discretion to set and adjust salaries, including discretion to correct salary inequities and * * * [t]his discretion provided a ready mechanism for remedying salary inequities based on sex. Accordingly, each time AECOM paid a faculty member after 1972, by regularly issued pay checks * * * it reasserted its prior discriminatory salary determinations.
******
Assume an employer in 1963 hired white vice-presidents at a salary of $10,000 per year and black vice-presidents at $8,000 per year. * * * In 1965, when private employers became subject to the provisions of Title VII, this hypothetical employer would have had a clear obligation under law to equalize the salaries of the white and black vice-presidents. If the employer merely awarded * * * raises on a nondiscriminatory basis, the prior racial classifications would not only continue, but indeed be created anew with each paycheck.

Plaintiffs’ Trial Br. at 14-15. This argument was not lost on Yeshiva, for it went to some lengths to respond. For example, Point III of Yeshiva’s trial brief was addressed to what Yeshiva saw as a fatal flaw in plaintiffs’ regression: that it failed to separate out the effects of pre-act salary disparities, which were not illegal and, in Yeshiva’s view, could not support a finding of a present violation. As Yeshiva argued,

Plaintiffs in this case stand in no better position than the unsuccessful plaintiff in Farris v. Board of Education, 576 F.2d 765 (8th Cir.1978). In that case, the Court dismissed plaintiff’s claim that her salary was comparatively low because annual raises were based on each previous year’s salary and her prior salary had been reduced when she took an unpaid maternity leave. While that past act might have constituted discrimination had it not occurred before the passage of Title VII, the defendants’ current policy of granting raises based on each previous year’s salary was a neutral one and, therefore, her claim was defeated.

Defendant’s Trial Brief at 49. Yeshiva went on to argue, “Plaintiffs [sic] yearly salary regressions of faculty hired both before and after the statute of limitations are only probative if AECOM was legally obligated to remedy time-barred discrimination * * *.” Id. at 50. After Bazemore, however, the point is not whether Yeshiva had to remedy time-barred discrimination (which would be pre-1974), but rather whether it had to remedy discrimination in the limitations period but which originated prior to the application of Title VII to universities in 1972. Nevertheless, it is clear that in its trial brief Yeshiva regarded at least one of the plaintiffs' claims as resting on an obligation to equalize salaries once Title VII applied to universities — the very claim the district court found to have been suddenly sprung on Yeshiva midway through the trial.

If these excerpts did not establish Yeshiva’s awareness of this aspect of plaintiffs’ case, the summation of its position in the pretrial brief surely did: “[T]he focus is not on the ‘bottom line’, which in this case would be the alleged salary disparities, but rather on the present employment practices and policies, such as the setting of starting salaries and annual increases, during the relevant time period.” Id. at 51. We conclude that defense counsel was here responding to the claim he perceived plaintiffs to be raising at that time — prior to trial.

Our reading of the trial transcript bolsters our conclusion that plaintiffs were from the outset raising the claim that the guideline system operated to maintain women’s salaries at a discriminatorily low level that had been set, for the most part, prior to 1972. Early in the trial, during the testimony of Dr. Ephraim Friedman, only the third witness, the district judge made the following statement of Sobel’s case:

The COURT: The position you are arguing for is that since they give salary increases yearly, albeit on a sort of guideline, across-the-board basis, that as of the first year the act becomes effective they were required to correct all pre-act inequities and to immediately bring all females to the salary level that they would have been at if they had not been discriminated against, perhaps for many years before the act * * *

Trial Tr. at 383. At least at that early stage of the trial, if not before, the district judge understood plaintiffs to be raising the argument that guideline increases, given presumably equally to both men and women, would not meet Yeshiva’s obligation to eradicate continuing effects of pre-1972 discrimination. Indeed, he told plaintiffs that “you may correctly assume that I have not bought your argument in your trial memo that Evans notwithstanding, failure each year to correct pre-act discrimination constitutes an offense under the act.” Id. at 382 (emphasis added). To be sure, the court did not accept the argument; nevertheless, plaintiffs did raise it.

In fact, this understanding of plaintiffs’ central argument pervaded the trial. The trial court repeatedly expressed skepticism over the validity of plaintiffs’ regressions because they did not separate out the effect of salaries that, even if discriminatory, were established before 1972 and were not, in the district court's view of pre-Bazemore law, required to be increased. For example, the court engaged in this exchange with one of plaintiffs’ expert witnesses, Dr. Donald Wise, and plaintiffs’ counsel, Michael Buchwach:

The Court: If * * * you are attempting to establish whether there are gender based discriminations only in years after a certain date and if some of the people were hired before that date so that you might commence with a gender based discrimination that carries over, does a pure linear regression establish that?
The Witness: I think it could. * * *
The Court: [F]or those who were hired before the effective date of Title 7, would not a better approach for determining discrimination in the years that the law was in effect have been not to use total salary, but to use year-by-year changes in increment as the basis of your comparison?
* * * * * *
[T]he plaintiffs’ approach is that after the act became effective, men and women should have been given the same raises, other things being equal, on a gross basis, in other words, you shouldn’t take into account the fact that somebody has been discriminated against before and use that as a basis for giving a smaller gross raise even if it is percentage-wise the same.
Mr. Buchwach: Your Honor, I don’t think you are accurately stating the plaintiffs’ theory.
The Court: What is your theory?
Mr. Buchwach: Plaintiffs’ theory is that people with the same background, experience and the other variables taken into account should be paid the same sums of money.
The Court: You go even further. You say that the first year the act went into effect and a 10 percent guideline raise was given to the man, taking him to $22,000, the woman should have been given a $12,000 raise from her $10,000 salary and paid $22,000 that year. I know you take that position.
Mr. Buchwach: Yes, your Honor.
The Court: I don’t think there is any possible support for it, but you can make an intermediate argument that even under Evans, while they weren’t compelled to bring her salary up to what it would otherwise have been, it would be inappropriate to use percentage raises that keep perpetuating the past discrepancy and that she was entitled to a $2,000 raise the first year and not a $1,000 raise the first year.

Trial Tr. at 712-16 (emphasis added). It is evident from this exchange that the court was well aware that plaintiffs’ central premise was that unless women’s salaries were immediately brought up to those of the men on the faculty, the guideline increases women would thereafter receive would never allow them to achieve salary parity. See also Trial Tr. at 979-81 (district court indicating that the “interesting comparison” was between pre-act salaries and post-act salaries, and perhaps that “to the extent there has been sex discrimination there has been an effort to eliminate it over the years that has at least been effective with the more recently employed persons”); at 1249 (district court questions witness based on premise that “since [women] had previously been discriminated against and since they were being given guideline raises thereafter, their salaries tended to keep dragging behind” those of men); at 1278 (district court hypothetically accepts “defendant’s position that [the law] does not require immediate changes in [the salary of] every woman who had been discriminated against pre-Act”, and characterizing as a “fall-back position” plaintiffs’ argument that even if immediate equalization is not required, constant percentage increases merely perpetuate past discrimination).

What appears to have occurred at trial is that the court early in the trial rejected plaintiffs’ legal argument that Yeshiva had a legal obligation to equalize women’s salaries immediately upon application of Title VII to universities, and concluded that so long as women fell no further behind in the post-act period, that was sufficient compliance on the part of Yeshiva. His concern appears to have been that if a male faculty member making $20,000 and a female faculty member making $10,000 both received a 10% raise, in real terms, the male would receive a $2,000 increase, while the woman would receive only $1,000 more in salary, thus not only perpetuating the pre-act disparities, but at least in absolute terms, widening them. Bazemore, of course, reveals that the trial judge’s initial determination was incorrect; the point here, however, is simply that in reaching that conclusion and focusing as he did on the more narrow question of percentage-v.-absolute salary increases, the district court was in fact dealing with the very issue that it later held had not even been raised until midway through trial.
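The arithmetic behind the percentage-versus-absolute concern described above can be sketched in a few lines. The $20,000 and $10,000 starting salaries and the 10% raise are the illustrative figures used in the opinion. The three-year horizon and the function name `apply_guideline_raises` are assumptions made only for this example, and the sketch does not model AECOM's actual guideline system, which in some years also included a minimum raise in absolute terms.

```python
def apply_guideline_raises(start, pct, years):
    """Return the salary path under the same whole-number percentage raise each year."""
    path = [start]
    for _ in range(years):
        # Integer arithmetic keeps the dollar figures exact.
        path.append(path[-1] * (100 + pct) // 100)
    return path

# Figures from the opinion's example; the three-year horizon is assumed.
male = apply_guideline_raises(20_000, 10, 3)
female = apply_guideline_raises(10_000, 10, 3)

# Dollar gap between the two salaries in each year.
gaps = [m - f for m, f in zip(male, female)]
# Woman's salary as a fraction of the man's in each year.
ratios = [f / m for m, f in zip(male, female)]

for year, (m, f, g) in enumerate(zip(male, female, gaps)):
    print(f"year {year}: male ${m:,}  female ${f:,}  gap ${g:,}")
```

Under these assumptions the woman's salary stays at exactly half the man's in every year, while the dollar gap grows from $10,000 to $13,310: equal percentage raises preserve the relative disparity and widen the absolute one.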

Our impression is further supported by the district court’s evidentiary rulings, which often turned on its perception of the limited relevance of evidence of pre-act salary discrimination. For example, plaintiffs sought to introduce a letter written by Dr. Sam Seifter, while he was chairperson of the biochemistry department, to Dr. Friedman, then dean of AECOM, in 1975, requesting an out-of-guideline increase for a female member of his department, Dr. Blumenfeld. As characterized by defendant’s counsel, the letter “reflect[ed] what some chairman [said] * * * Dr. Blumenfeld has been paid lower salaries because of some historical trends.” Trial Tr. at 365. Plaintiffs wanted the letter introduced to show the dean’s awareness of complaints of salary inequities due to gender; as counsel argued, “I think that’s exactly what Dr. Seifter was complaining about because it says in * * * Dr. Seifter’s letter: Her relatively low salary is in my judgment related to lower salaries accorded to women and which have been perpetuated despite regular increments.” Id. at 369 (emphasis added).

The district court reserved ruling on the Seifter letter at that point, but later stated (based on the fact that Dr. Blumenfeld apparently received the requested out-of-guideline increase),

I would point out to you that depending on the view one takes of evidence in other cases, it is conceivable that this is evidence in support of the defense, namely, that there was historical discrimination against women, that it ceased on the day Title 7 became effective to Yeshiva, and that apparent statistical discrepancies are carryovers which under Evans you are not required to remedy.

Trial Tr. at 633 (emphasis added). At other points in the trial, the district court indicated that inequities that may have existed prior to 1974, the pre-limitations period, could be shown only to demonstrate “background”, and that such evidence was not directly relevant as to the remedial period. As we discuss more fully below, these rulings were erroneous in light of Bazemore, but, erroneous or not, they demonstrate that the thinking of the district court plainly was focused on the idea of isolating pre-act discrimination. That the district court was aware of the nature of Sobel’s “disparate impact” theory is further made plain by its statement to plaintiffs’ expert, Dr. Ashenfelter:

[O]ne [possible conclusion] is that a substantial part of what you detect — what you have determined to be sex discrimination is directed at those women employed before March 24, 1972, and to that extent it brings into focus some of the different legal theories we have discussed concerning carryover effects, corrections of them, and what-have-you * * *

Trial Tr. at 980-81. Thus, it is evident that plaintiffs’ “disparate impact” claim, in substance, was before the district court before trial, and was a major focus of the court’s thinking during trial.

B. The Legal Characterization of Sobel’s Claim.

The district court’s finding that Sobel’s disparate impact claim was not timely raised depended, of course, on its ruling that what it perceived her to be arguing was, in fact, a disparate impact claim. We disagree. This was a mischaracterization of the nature of Sobel’s Bazemore-style allegation and erroneously focused on salary increases under the guideline system rather than on the initial failure, when Title VII was first applied to universities, to raise women’s salaries to the same levels enjoyed by comparably situated men.

As we have already discussed, Sobel’s claim that Yeshiva’s guideline system of salary increases perpetuated (and, to some degree, exacerbated) pre-act salary discrimination was before the district court prior to and throughout the trial. This claim is on all fours with the claim recognized by the Supreme Court in Bazemore. See 106 S.Ct. at 3004-06. Bazemore involved an allegation of discrimination on the basis of race in salaries in the North Carolina Agricultural Extension Service. The plaintiffs alleged that initial salaries paid to black employees were lower than those paid to whites, and that such discrimination was perpetuated by the defendant’s system of increasing salaries, which in some ways was remarkably similar to Yeshiva’s guideline system. See Bazemore v. Friday, 751 F.2d 662, 668, 671 (4th Cir.1984) (portion of salary increases based on across-the-board and percentage increases given nondiscriminatorily; remainder based on subjective merit a potential source of discrimination). The plaintiffs argued “that the pre-Act discriminatory difference in salaries should have been affirmatively eliminated but has not.” Id. at 670. That was precisely plaintiffs’ claim here.

Such a claim is not properly characterized solely as one of disparate impact. A classic disparate impact analysis examines a facially-neutral policy or practice, such as a hiring test or a height requirement, for its differential impact on the hiring or salary of a particular group. See, e.g., Connecticut v. Teal, 457 U.S. 440, 446, 102 S.Ct. 2525, 2530, 73 L.Ed.2d 130 (1982); Dothard, 433 U.S. at 329, 97 S.Ct. at 2726; Clady v. County of Los Angeles, 770 F.2d 1421, 1427 (9th Cir.1985), cert. denied, 475 U.S. 1109, 106 S.Ct. 1516, 89 L.Ed.2d 915 (1986). Never directly questioned in a disparate impact case — probably because it is not an essential element of a plaintiff’s prima facie case — is why the practice in question disproportionately affects the group. If the issue arises, it does so in the context of the employer’s burden, in light of the plaintiff’s prima facie showing, to show that “any given requirement [having a disparate impact] must have a manifest relationship to the employment in question.” Griggs v. Duke Power Co., 401 U.S. 424, 432, 91 S.Ct. 849, 854, 28 L.Ed.2d 158 (1971). If, for example, a test of physical skills such as speed, strength, and stamina disproportionately disqualifies women from employment as firefighters, the employer would attempt to demonstrate that the reason women were failing the test more often than men was because they lack those attributes, and then that such qualities were job-related. See Berkman v. City of New York, 812 F.2d 52, 59-60 (2d Cir.), cert. denied, — U.S. —, 108 S.Ct. 146, 98 L.Ed.2d 102 (1987).

In this case, the reason why the facially neutral guideline system had a “disproportionate” impact on women — indeed, the only reason such a system of across-the-board salary increases could ever have a disparate impact — was because of pre-act discrimination, either in setting initial salaries or, at some point, in increasing them discriminatorily. The disparity in impact of the facially neutral guideline policy resulted from earlier disparate treatment, both pre-act and pre-statute of limitations. The neutral mechanism, far from being the discriminatory act, is merely the means by which the pre-act and pre-limitations disparate treatment is carried forward into the actionable time frame.

Here, the adherence by Yeshiva to the guideline system was not the violation claimed by Sobel; she, like the Bazemore plaintiffs, was challenging the failure to equalize salaries, separate and apart from the operation of the guideline system. See Sobel I, 566 F.Supp. at 1186 n. 54 (“[I]f [plaintiffs] were simply granted a one-time adjustment to make the average of their salaries equal to that of their male counterparts, they would be perfectly content to let the guideline system continue operating just as it has for years.”). Of course, the two are in practice two sides of the same coin; adherence to the guideline system effectively precluded giving the one-time increases that would have equalized women’s salaries.

While it is true that in a disparate treatment case a plaintiff ordinarily must show discriminatory motive, a showing unnecessary in a disparate impact case, see Dothard, 433 U.S. at 329, 97 S.Ct. at 2726; Williams v. Colorado Springs, Colo. Sch. Dist., 641 F.2d 835, 839 (10th Cir.1981), this distinction is not relevant to a Bazemore claim. While Sobel probably could not show that Yeshiva’s adherence to the guideline system was done with discriminatory motive, that adherence was, as we have noted, only the manner in which the disparities were perpetuated; the violation was Yeshiva’s failure to remedy the disparities. The failure to bring women’s salaries up to par with those of men the day Title VII applied to Yeshiva is the sort of pattern and practice that would sustain a disparate treatment claim, even absent explicit proof of discriminatory motive. See International Bhd. of Teamsters v. United States, 431 U.S. 324, 335 n. 15, 97 S.Ct. 1843, 1854 n. 15, 52 L.Ed.2d 396 (1977).

C. The Treatment of Plaintiffs’ Claim After Bazemore.

Even if we were to agree that the district court was initially correct in treating as procedurally barred plaintiffs’ claim of perpetuation of salary disparities, we would nevertheless conclude that in light of Bazemore it was error to continue to do so on our remand to the district court.

It is beyond question that a pending case must be decided under the law in effect at the time it is decided, as opposed to that governing when the case was tried, if the law changes in the interim. Thorpe v. Housing Authority, 393 U.S. 268, 281-82, 89 S.Ct. 518, 525-26, 21 L.Ed.2d 474 (1969) (“The general rule * * * is that an appellate court must apply the law in effect at the time it renders its decision. * * * This same reasoning has been applied where the change was constitutional, statutory, or judicial.” (footnotes omitted)); Spirt v. Teachers Ins. and Annuity Ass’n, 735 F.2d 23 (2d Cir.), cert. denied, 469 U.S. 881, 105 S.Ct. 247, 83 L.Ed.2d 185 (1984); National Auto Brokers v. General Motors Corp., 572 F.2d 953, 960 & n. 11 (2d Cir.1978), cert. denied, 439 U.S. 1072, 99 S.Ct. 844, 59 L.Ed.2d 38 (1979).

In light of Bazemore, various of the district court’s rulings became erroneous. The most obvious error (with hindsight) was that the court discounted the weight to be accorded plaintiffs’ regression studies in part because plaintiffs had not eliminated the effects of pre-act salary discrimination. While this was not the only basis on which the trial judge rejected plaintiffs’ experts’ conclusions, it is clear from his opinion, as well as from his intense questioning of plaintiffs’ experts on this very issue, that this was a critical element in his mind.

Perhaps as important, the district court's concentration on the discrimination that directly could be traced to the post-act period prevented it from focusing on that portion of the data that was the strongest part of plaintiffs’ proof — the salary disparities among pre-1972 hires. Indeed, the court discounted the strength of plaintiffs’ data for the entire class because it included pre-1972 hires, whose lower salaries the court felt resulted from pre-1972, and therefore not actionable, salary decisions.

Moreover, the court’s treatment of much of plaintiffs’ anecdotal evidence of discrimination by Yeshiva was affected by its secondary focus on acts occurring before 1972. The court criticized plaintiffs for the paucity of anecdotal evidence, 566 F.Supp. at 1184-86, but most of what they tried to introduce related to pre-act events, and the district court either refused to admit it at all, Trial Tr. at 382 (were court to accept plaintiffs’ “continuing effects” theory, “there would be a good deal more admissibility” to rejected letter written by department chairperson in 1976 indicating past gender inequities had not been resolved), or allowed it in only as “background”, Trial Tr. at 251-59 (accepting only for limited purposes document relating to recommendations for out-of-guideline increases, allegedly to correct sex disparities, made prior to limitations period). In Sobel I, the court said that anecdotal evidence dating from before December 1974, the limitations date, was “entitled to little weight and constitute[d] only relevant background evidence.” 566 F.Supp. at 1185. So limiting the presentation of plaintiffs’ case might well have been appropriate before the Supreme Court decided Bazemore, see Bazemore, 751 F.2d at 672 (“pre-Act discrimination” admissible “to show the general background of the case, or intent, or to support an inference that such discrimination continued”), but the Supreme Court’s reversal of the fourth circuit in Bazemore makes such evidence directly probative as supporting the claim that the disparities evidenced in the post-1972 salaries of pre-1972 hires were due to discrimination against them.

In short, despite the length of the trial, of the record, of the exhibit list, and of the district court’s opinion, plaintiffs have not received a full and fair opportunity to have their case heard in light of the new learning contributed by Bazemore. The unusual nature of the Bazemore claim makes it appropriate to allow plaintiffs even now an opportunity to construct a “continuing effects” claim, even if Sobel had not attempted to do so in the first trial. Since she did in fact try to make out such a case, and since defendant was well aware of what it was that plaintiffs were seeking to do, there is no imaginable unfairness to Yeshiva in allowing plaintiffs to establish what they tried to establish at the first trial, and might well have been able to establish but for the rulings of the district court made erroneous by Bazemore.

It would not have been surprising if plaintiffs had failed to focus on the theory that pre-1972 disparities in salary were carried over into the post-act (and, later, into the limitations) period, since the district court, and the weight of authority, indicated that the continuing effects theory was closed to plaintiffs by United Air Lines, Inc. v. Evans, 431 U.S. 553, 97 S.Ct. 1885. Perhaps these plaintiffs may be considered fortunate that Bazemore came down just in time for it to be applied to their case, but good fortune is sometimes a litigant’s best ally.

We expected with our earlier remand in this case that the district court would reopen the case for the admission of additional evidence, including some that originally was excluded, and for reevaluation of the entire record as supplemented in order to consider the related questions of whether there was pre-act discrimination and whether it was, in fact, carried over into the actionable period. Bazemore gave plaintiffs the right to such an inquiry, a right denied to them by the district court’s cursory treatment of the case on remand.

II. Alternative Bases for the District Court’s Decision.

The district court appears to have offered at least two other grounds for dismissal of plaintiffs’ complaint. First, it distinguished Bazemore on the ground that it involved discrimination against blacks, while here the plaintiff class is composed of women. Second, in its initial decision, the court held that plaintiffs had not established their “disparate impact” claim. Neither factor is sufficient to avoid a full reevaluation of the case under Bazemore.

A. The Applicability of Bazemore to Gender Discrimination.

In its opinion on remand, the district court drew a distinction between race and gender discrimination, on the ground that racial discrimination was illegal even before Title VII, while gender discrimination claims are “solely a product of Title VII, and claims in that regard did not exist for pre-Act periods.” Sobel III, 656 F.Supp. at 589 n. 5. He implied from this that employers ought not be held liable for the consequences of pre-act discrimination not illegal when made.

We need not tarry long on this erroneous conclusion, since it rests on a misunderstanding of Bazemore. The Supreme Court did not allow an employer to be held liable for its pre-act decisions; indeed, Bazemore specifically stated that there is no back-pay liability for the discriminatory paychecks received by blacks before Title VII became effective to the North Carolina Agricultural Extension Service. Bazemore, 106 S.Ct. at 3006.

What the employer is liable for is continuing its pre-act discrimination into the post-act period. It violates Title VII to pay lesser salaries to protected employees during the time the statute applies to the employer, and in determining whether such a violation occurred, the prior character of the pre-act discrimination — whether it was illegal apart from Title VII — is irrelevant. Bazemore held that the treatment of blacks after Title VII was applied to their employer can constitute a Title VII violation, even if the conduct began before the effective date. There is no reason that logic should not apply with equal force to gender discrimination. Cf. City of Los Angeles v. Manhart, 435 U.S. 702, 709, 98 S.Ct. 1370, 1375, 55 L.Ed.2d 657 (1978) (equating claims of racial and gender discrimination).

B. The Viability of Plaintiffs’ “Continuing Effects” Claim.

In Sobel I, the district court offered alternative reasons for rejecting plaintiffs’ “disparate impact” theory. We have already addressed the first, the alleged procedural bar arising from the timing of plaintiffs’ assertion of the claim. The second was that plaintiffs had been unsuccessful in proving the claim. As the district judge himself intimated in Sobel III, this conclusion cannot stand in the face of Bazemore.

It appears to us that even as the record stands now, without the supplementation we had expected would occur on the remand, there is considerable evidence to support a Bazemore-type violation. Table 3 of the district court’s first opinion, 566 F.Supp. at 1177, demonstrates that when only plaintiffs’ variables are used in a regression analysis, and using the data agreed upon by the parties, the analysis produces a statistically significant sex coefficient among pre-1972 hires for every year from 1974 to 1979, whether the computation is done on salary or a logarithm of salary, except 1976, where the coefficient for salary is 1.98, just under the 2.0 level of statistical significance. Table 4 shows that even when some of defendants' suggested variables are included, the statistical significance is maintained for 1977-1979 when plaintiffs’ data is used, and for 1976 and 1977 when the agreed-upon data is used.

The district court plainly found this data important in its decision, since it discussed at length as a weakness in plaintiffs’ case the fact that they did not “analyze the extent to which salary differentials may have been the consequence of discriminatory acts that occurred prior to the date that Title VII became applicable to universities”, id. at 1182, and made much of the statement by plaintiffs’ expert, Dr. Ashenfelter, that, when that analysis was later performed, it “revealed no statistically significant proof of sexual discrimination during the relevant time period.” Id.

We are frankly skeptical of defendant’s claim now that there was no discrimination even before Title VII was applied to Yeshiva. The district court discounted plaintiffs’ showing of salary disparities in the actionable period because it felt whatever disparities had been shown were due to pre-act events. If that was true when Yeshiva stood to benefit because Evans and its progeny appeared to insulate it from liability for those pre-act events, it is no less true now that that view of Evans has been discredited by Bazemore. Fact-finding is not a function of which side will benefit depending upon how the relevant law may treat a given fact.

While it is true that Yeshiva has always maintained that there is no evidence even of pre-act discrimination, and has relied only as a secondary position on the idea that if there was discrimination it must have been pre-1972, the district court appears to have accepted that secondary position as at least part of the true explanation for the current (i.e., 1974-1979) disparities. The district court’s comments during trial reflect this, see, e.g., Trial Tr. at 980-81 (“From [the increases in women’s relative pay] two possible conclusions can be drawn, and one is that a substantial part of what you detect — what you have determined to be sex discrimination is directed at those women employed before March 24, 1972 * * *.”), and its opinion confirms it. See Sobel I, 566 F.Supp. at 1188 (“The original salaries of those hired before March 24, 1972, did not have to meet the standards established under Title VII. * * * Hence, the plaintiffs are placed in the awkward position of arguing that past initial salaries which were allegedly discriminatory but certainly legal, and present annual increases that have been shown to be fair and non-discriminatory somehow add up to a ‘present violation’ of Title VII.”). While the district court was appropriately cautious about making an unnecessary (given its view of the pre-Bazemore law) determination of whether plaintiffs’ allegations of pre-act salary discrimination had been proven, such a determination is now rendered necessary by Bazemore.

Our conclusion that plaintiffs’ allegation of pre-act salary discrimination being carried over into post-act salary disparities must be considered anew is buttressed by the district court’s opinion on remand. The district court stated, “Had we relied solely on Evans in dismissing the plaintiffs’ disparate impact claim on the merits, our prior ruling might require reconsideration.” Sobel III, 656 F.Supp. at 590. Since we have already rejected the alternative ground (the procedural bar) upon which the district court’s dismissal rested, we agree that the prior ruling “require[s] reconsideration.” This is true even though the trial judge indicated in his opinion that “the evidence showed that [lower salaries initially paid to some female faculty members hired prior to 1972] probably resulted not from discrimination, but from several gender-neutral factors.” Id. at 589.

Such so-called “gender-neutral factors” pointed to by the district court are, however, little more than inferences, built on speculation and stereotypes, unsupported by the record. In a footnote from Sobel I —repeated in Sobel III — the district court found that much of the pre-act disparity was due to the “accepted sociological fact that [in the 1960s] the percentage of men who were the sole wage earners for families with children exceeded the percentage of married women who were such.” Sobel I, 566 F.Supp. at 1184 n. 49. According to the court, it “was considered not only socially acceptable but also socially desirable” to favor sole heads of households. Sobel III, 656 F.Supp. at 590 n. 8.

The problems with this approach to fact-finding are many. First, Yeshiva introduced no evidence — none—either that such a policy was in effect at AECOM, even informally, or that the “accepted sociological fact” that men were more often sole wage earners was true on the AECOM faculty. There was some testimony that, with AECOM in some financial difficulty in the late 1960s, salary increases tended to go to those who complained loudest and longest. Trial Tr. at 1251-52. It was the court, however, that inferred from that that sole wage earners were more likely to complain and therefore to get the raises, and then inferred further that this was likely to favor men because the “accepted sociological fact” was true at AECOM, and finally that this difference explained the pre-1972 salary disparities. We find this chain of inferences too weak to support a valid conclusion.

Equally important, there is no evidence in the record indicating how such a “policy” — if it existed — was implemented, and whether it was “gender-neutral”. For example, assuming that some women on the AECOM faculty were sole wage earners, did Yeshiva pay them more than their female colleagues not so situated? Were there demonstrable differences between male faculty members based on their status as the head of a household? If such a “gender-neutral” policy existed, and if it had a disproportionate effect on women, Yeshiva presumably would have to show that this policy was job-related, a showing we doubt it could make. In short, the supposedly “gender-neutral” factor found by the district court rested on a stereotyped notion of male and female characteristics (in this instance, “sociological” characteristics) the application of which to this case simply had no basis in the record, and which almost certainly violated Title VII if it continued to affect women after Title VII was applied to Yeshiva. Cf. Manhart, 435 U.S. at 707, 98 S.Ct. at 1375 (“It is now well recognized that employment decisions cannot be predicated on mere ‘stereotyped’ impressions about the characteristics of males or females. Myths and purely habitual assumptions * * * are no longer acceptable reasons for refusing to employ qualified individuals, or for paying them less.”).

In short, the district court’s explanations for disregarding pre-act salary disparities simply are not supported by the evidence, which to this point is incomplete on the crucial question of pre-1972 salaries at Yeshiva. We need not speculate, as the district court did, see Sobel III, 656 F.Supp. at 590-91 n. 10 (“In view of the evidence presented, we question whether such a case could ever have been successful.”), as to the likelihood of plaintiffs’ succeeding in a new trial. They are entitled to try.

III. Matters For Administration on Remand.

As we have indicated, we are concerned about the long history this case has thus far written. In the hope that we can facilitate the closing of this chapter in the lives of these litigants, of the southern district, and of this court, we offer the following suggestions for the remand.

A. The Weight to be Accorded Plaintiffs’ Regressions.

Part of our initial remand in this case was for the purpose of reconsidering the probative value of plaintiffs’ regression analyses in light of Bazemore. See Sobel II, 797 F.2d at 1479. While we agree with the district court’s determination in Sobel III that it was correct in Sobel I insofar as it treated the flaws in plaintiffs’ regressions as going to the weight rather than their admissibility, see Bazemore, 106 S.Ct. at 3009, the court did not, in Sobel I, apply the correct standard in evaluating defendant’s objections to the regressions.

We read Bazemore to require a defendant challenging the validity of a multiple regression analysis to make a showing that the factors it contends ought to have been included would weaken the showing of a salary disparity made by the analysis. See Bazemore, 106 S.Ct. at 3010-11 n. 14 (“[Defendants’] strategy at trial was to declare simply that many factors go into making up an individual employee’s salary; they made no attempt * * * to demonstrate that when these factors were properly organized and accounted for there was no significant disparity between the salaries of blacks and whites.”); see also Palmer v. Schultz, 815 F.2d 84, 101 (D.C.Cir.1987) (“Implicit in the Bazemore holding is the principle that a mere conjecture or assertion on the defendant’s part that some missing factor would explain the existing disparities * * * generally cannot defeat the inference of discrimination created by plaintiffs’ statistics.”).

Here, one of Yeshiva’s primary attacks on plaintiffs’ proof was “the plaintiffs’ use of inadequate proxies for productivity.” Sobel I, 566 F.Supp. at 1182. Yeshiva contended that Sobel had left out several important variables that would represent productivity and, implicitly, reduce the sex coefficient by explaining some of the disparity in salary which plaintiffs’ experts had attributed to gender discrimination.

With one exception — rank—which will be discussed shortly, Yeshiva did not show that, with these factors accounted for, the apparent gender disparity was reduced. Yeshiva’s experts simply criticized plaintiffs’ failure to include them, offering no reason, in evidence or analysis, for concluding that they correlated with sex and therefore were likely to affect the sex coefficient. Of course, Yeshiva is free on retrial to seek to show that any regression offered by plaintiffs is inadequate for lack of a given variable, but such an attack should be specific and make a showing of relevance for each particular variable it contends plaintiffs ought to include. Ideally, Yeshiva would seek to do so by offering its own regression that includes the variable it contends improperly was omitted. At the first trial, the single regression Yeshiva offered was packed with all of the factors it contended plaintiffs should have included, but it provided the court with no chance to sift through the various factors to determine the weight to be assigned to any of them.

As contained in Yeshiva’s experts’ report, the failure by plaintiffs adequately to reflect productivity caused an “underadjustment bias” whereby the disparities based on productivity were understated, and the disparities attributed to gender were correspondingly overstated. The report stated:

[I]t is a mathematical fact that if in a multiple regression analysis salary is regressed on proxies that imperfectly reflect qualifications and productivity, and if women generally have lower proxy values than men, then there will be an underadjustment for differences in true productivity and a resulting overstatement of the sex coefficient

Defendant’s Statistical Report, Trial Ex. 745, at 69-70 (emphasis in original). The key to this argument, of course, is that “women generally have lower proxy values than men”. If women and men have equal measures of true productivity, then having imperfect variables for productivity would cause no underadjustment.

Bazemore, as applied here, requires Yeshiva to show that the failure to include a proxy causes an actual underadjustment. Yeshiva’s experts concluded that Sobel’s regressions contained an “underadjustment bias” simply because men scored higher on the included variables. See id. at 74 n.*. Insofar as Yeshiva argued that simply because men “scored” higher the imperfection of the included variables itself proved that the regression underadjusted for productivity, the argument is unpersuasive. Men might have scored even higher had the variables perfectly reflected productivity, and this would have explained even more of the apparent gender disparity. But it is equally possible that the imperfection had the opposite effect; that women would have scored higher if the proxies were more accurate. On the present record, there is no way to tell which gender was disadvantaged by the imperfections.

Put another way, all that is known about the proxies used by plaintiffs in their regression is that they are not perfect measures of productivity, and that insofar as they do measure productivity they show that men on the AECOM faculty possess the attributes tied to productivity (e.g., experience) in greater measure. What is not known is whether variables that exactly measured productivity would show the same advantage for men (and thus would explain the same portion of the raw gender disparity as the imperfect proxies), a lesser advantage for men (and therefore explain less of the gender discrepancy), or a larger advantage. In short, the simple fact of imperfection, without more, does not establish that plaintiffs’ model suffers from underadjustment, even though men score higher on the proxies.
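The point can be illustrated with a minimal simulation, offered only as a sketch under stated assumptions and not as anything drawn from the record. Every number below is hypothetical: salary is generated from true productivity alone, with no sex term at all, so any nonzero coefficient on sex is purely an artifact of the imperfect proxy. The names `sex_coefficient`, `simulate`, `true_gap`, and `proxy_shift` are illustrative inventions. The sketch assumes a single proxy whose measurement error is random and, in the third scenario, also shifted against women.

```python
import random
from statistics import mean


def sex_coefficient(salary, proxy, female):
    """OLS coefficient on the female dummy in: salary ~ 1 + proxy + female.

    With one covariate and one dummy, the coefficient equals the gap in mean
    salary minus the pooled within-group slope times the gap in mean proxy.
    """
    ys = {True: [], False: []}
    xs = {True: [], False: []}
    for y, x, f in zip(salary, proxy, female):
        ys[f].append(y)
        xs[f].append(x)
    ybar = {f: mean(v) for f, v in ys.items()}
    xbar = {f: mean(v) for f, v in xs.items()}
    sxy = sum((x - xbar[f]) * (y - ybar[f]) for y, x, f in zip(salary, proxy, female))
    sxx = sum((x - xbar[f]) ** 2 for x, f in zip(proxy, female))
    slope = sxy / sxx
    return (ybar[True] - ybar[False]) - slope * (xbar[True] - xbar[False])


def simulate(true_gap, proxy_shift, n=20000, seed=1):
    """Hypothetical faculty: salary depends only on true productivity.

    true_gap    -- how far women's mean TRUE productivity sits below men's
    proxy_shift -- how far the proxy additionally understates women's productivity
    """
    rng = random.Random(seed)
    salary, proxy, female = [], [], []
    for i in range(n):
        f = (i % 2 == 0)
        productivity = rng.gauss(0, 1) - (true_gap if f else 0.0)
        salary.append(50 + 10 * productivity + rng.gauss(0, 1))  # no sex term
        proxy.append(productivity + rng.gauss(0, 1) - (proxy_shift if f else 0.0))
        female.append(f)
    return sex_coefficient(salary, proxy, female)


equal_productivity = simulate(true_gap=0.0, proxy_shift=0.0)  # near zero
men_truly_higher = simulate(true_gap=1.0, proxy_shift=0.0)    # spuriously negative
proxy_unfair = simulate(true_gap=0.0, proxy_shift=1.0)        # spuriously positive
```

In this sketch the first scenario yields a sex coefficient near zero, the second reproduces the "underadjustment bias" described in Yeshiva's report, and the third produces an error in the opposite direction. Which scenario describes a real faculty cannot be read off the fact that men score higher on the proxy, since both the second and third scenarios share that feature.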

If the argument is instead that the excluded variables are likely to favor men simply because the included ones do so, and that therefore failing to include them is what causes underadjustment, Yeshiva still must show that the former ought to have been included on a stronger basis than simply that they favor men; they must still be actual determinants of salary, or at least adequate proxies for productivity. Moreover, they must be shown not to be multicollinear with those variables already included, and for that matter with each other, and also not themselves tainted by sex discrimination. In short, if Yeshiva seeks to show that plaintiffs’ regression analysis suffers from failure appropriately to adjust raw salary disparities for differences in productivity, it must actually demonstrate that failure. It cannot rely on assumptions about imperfections inherent in productivity proxies, nor can it simply propose alternative variables without justifying their inclusion.

As to rank, we conclude that it was appropriately included by the district court. Plaintiffs concede that rank does correlate with sex, and concede further that rank does at least loosely reflect productivity. They argue only that its inclusion may serve to mask sex discrimination because promotions in rank might not have been gender-neutral. However, the district court concluded that “promotions in rank * * * were in fact based on merit and were not contaminated by elements of sexual discrimination”, Sobel I, 566 F.Supp. at 1180, and in light of the fact that plaintiffs abandoned before trial any claim of gender discrimination in promotions, that finding is not clearly erroneous. Thus, on remand, rank should be included as a variable in any regression analysis.

We also reject Yeshiva’s attack on the multiple regression technique as a general matter when applied to the complex and diverse context of a medical school faculty. While it is true that the relative uniqueness of each faculty member, and the subjectivity of many of the determinants of salary, make a regression analysis difficult, these problems are not insurmountable. Indeed, as a device designed to sift through various factors in order to assess as accurately as possible the influence of any one of them, the multiple regression analysis is the accepted means for performing this difficult task. See Bazemore, 106 S.Ct. at 3008-09 (accepting plaintiffs’ case which “relied heavily on multiple regression analyses” and saying that a “plaintiff in a Title VII suit need not prove discrimination with scientific certainty”). Accepting Yeshiva’s contention would have the practical effect of insulating universities from charges of discrimination in the setting of faculty salaries, since such claims may well be virtually unprovable by any other means.

In the place of multiple regressions, Yeshiva sought to introduce the so-called “urn model”, an analytic tool remarkable only for its extremely limited usefulness. In an effort to show that the salary disparity identified by plaintiffs as being due to gender could in fact occur at random, Yeshiva’s experts conducted the following procedure: slips containing the salaries of all AECOM faculty members were placed in an “urn”, and a number equaling the number of women on the faculty was drawn at random. This, in effect, formed a “control” group that could be compared to the female faculty members, since the random selection was by definition nondiscriminatory. If the average salary of the “control” group, when compared to the salary slips left in the urn, approximated the disparity that existed between women and men, it would tend to show that that disparity could have occurred at random.

The advantage to this approach is its simplicity, since “its use did not depend on the same underlying assumptions upon which the plaintiffs’ model rested, and which were so much in doubt.” Sobel I, 566 F.Supp. at 1183 n. 42. Because of this, the court found that “the urn model provided a very appropriate test given Einstein’s complex organization and the diffuse factors affecting salaries.” Id.

All the urn model tends to show, however, is that a given salary disparity could occur at random. It does not show that this disparity did occur at random. It is implicit in the concept of a multiple regression that the importance it attributes to a variable, such as gender, is susceptible to varying degrees of certainty; except in the most extreme cases, there is always the possibility that any difference attributed to a given factor will actually be the result of chance. The chance that random selection will produce the same disparity is reflected in the statistical significance of the disparity — the greater the disparity, the less the chance it occurred at random, and the greater the statistical significance of the sex coefficient.

Thus, the value of the urn model is limited to graphically illustrating the uncertainty inherent in any multiple regression analysis. Ultimately, the simplicity that is its asset also severely limits its probative value.
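
By way of illustration only, the urn procedure described above can be sketched as a repeated random draw. This is a minimal sketch and not a reconstruction of the analysis Yeshiva's experts actually performed: the function names, the number of trials, the fixed seed, and the salary figures in the usage lines are all hypothetical assumptions introduced for the example.

```python
import random
from statistics import mean


def urn_draw_gap(salaries, n_drawn, rng):
    """Draw n_drawn salary slips at random from the urn.

    Returns mean(slips left in the urn) - mean(slips drawn), i.e. the
    disparity between the random "control" group and everyone else.
    """
    drawn_idx = set(rng.sample(range(len(salaries)), n_drawn))
    drawn = [s for i, s in enumerate(salaries) if i in drawn_idx]
    left = [s for i, s in enumerate(salaries) if i not in drawn_idx]
    return mean(left) - mean(drawn)


def urn_model(salaries, n_drawn, observed_gap, trials=10_000, seed=0):
    """Fraction of random draws whose gap is at least observed_gap.

    A small fraction suggests the observed disparity would rarely
    arise from a purely random (nondiscriminatory) selection.
    """
    rng = random.Random(seed)  # fixed seed (assumption) for repeatability
    hits = sum(
        urn_draw_gap(salaries, n_drawn, rng) >= observed_gap
        for _ in range(trials)
    )
    return hits / trials


# Hypothetical figures, not drawn from the record.
faculty_salaries = [42_000, 45_000, 47_500, 51_000, 53_000,
                    55_000, 58_500, 61_000, 64_000, 70_000]
share = urn_model(faculty_salaries, n_drawn=3, observed_gap=5_000)
print(f"Share of random draws with a gap of at least 5,000: {share:.3f}")
```

As the sketch makes plain, the procedure takes no account of rank, experience, or any other determinant of salary. It reports only how often a disparity of a given size arises from random selection, which is the same information conveyed by the statistical significance of the sex coefficient in a regression.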

In sum, on remand the district court should discount the weight to be accorded plaintiffs’ regression analysis because of the failure to include an explanatory variable only upon a showing by the defendant that the missing variable is a determinant of salary and correlates with sex, and thus is likely to cause a demonstrable, rather than an assumed, underadjustment bias. Any regressions should include rank as a variable, while inclusion of any other contested variables will depend on the facts relevant to that variable. The failure, by either side, to include a relevant variable (or the inclusion of an irrelevant or multi-collinear variable) will go to the probative value of the analysis, not its admissibility.

B. The Focus on Pre-1972 Hires.

A major focus of the district court on retrial should be on the disparity affecting female faculty members hired before 1972. But we do not foreclose plaintiffs from seeking to demonstrate salary disparities across the entire class of female AECOM faculty, if that still seems appropriate despite the trial judge’s view that “the appearance of discrimination in salaries during the relevant time period resulted from lower salaries paid to female faculty members who had been hired prior to 1972.” Id. at 1182.

Yeshiva, of course, may attempt to show that there was no salary disparity, even among pre-1972 hires, either before 1972 or since. It may also attempt to demonstrate that even if there was a pre-act disparity, it did not carry over into the actionable time period, after December 1974. To fully succeed, however, such a showing must be made not only as to the entire class, but also as to the pre-1972 hires.

To determine whether a valid Bazemore claim exists, the pre-1972 hires must be separately analyzed from the post-1972 group, because a study of the entire class, both pre- and post-1972 hires, would tend to mask any continuing effects of pre-act discrimination. Should the district court find that discrimination against pre-act hires explains the apparent general disparity between men and women, relief could accordingly be targeted to the pre-1972 hires.

C. Remand to a Different District Judge.

We reluctantly conclude that it is necessary to remand the case to a different district judge. We frankly are disturbed by the manner in which the district court treated this case on our initial remand. It is clear that, in light of Bazemore, the first trial was replete with error, and that a fresh look at the evidence was necessary on remand, along with an opportunity to supplement the record with new evidence relating to the pre-act period and its post-1974 consequences. We intend no criticism of the trial judge’s handling of the first trial; he conducted a thorough and searching inquiry after what must have seemed endless discovery, and made detailed findings in his opinion in Sobel I. It was only after Bazemore that his efforts to that point became inadequate.

While remanding to a different district judge is an “extraordinary remedy * * * [to] be reserved for the extraordinary case”, United States v. Robin, 545 F.2d 775, 784 (2d Cir.1976) (Timbers, J., dissenting), we believe that this case is, indeed, extraordinary. Cf. United States v. Spears, 827 F.2d 705, 709 (11th Cir.1987) (where district judge apparently evidenced bias against government, reassignment was ordered); In re Matter of Yagman, 796 F.2d 1165, 1188 (9th Cir.1986) (even where circuit court rejects contention district court was biased as ground for reversal, and does not doubt district judge’s “ability to act fairly”, remand to new judge “necessary to preserve the appearance of justice”); Davis & Cox v. Summa Corp., 751 F.2d 1507, 1523 (9th Cir.1985) (“A district court judge’s adamance in making an erroneous ruling may justify remanding the case to a different judge.”).

CONCLUSION

To summarize, the case is remanded for a retrial which shall include full consideration of plaintiffs’ claim that there were pre-1972 salary disparities carried over into the post-1974 limitations period because of Yeshiva’s failure to equalize salaries upon its being covered by Title VII, and reconsideration of plaintiffs’ claims for the entire class in light of all the evidence, with their regression analyses to be evaluated in accordance with Bazemore’s principles. There should be a reasonable period for additional discovery, if needed. We suggest that in the interest of economy the parties and the court cooperate in stipulating as to what evidence introduced at the first trial may be deemed part of the record on the retrial, to be considered for whatever probative value it may have in light of Bazemore and this opinion.

The judgment of the district court is reversed and the case is remanded to the district court for further proceedings.  