
    Wayne R. METZ, Plaintiff-Appellant, v. TRANSIT MIX, INC., Defendant-Appellee.
    No. 86-2261.
    United States Court of Appeals, Seventh Circuit.
    Argued Jan. 16, 1987.
    Decided Aug. 28, 1987.
    
      Ivan Bodensteiner, Valparaiso, Ind., for plaintiff-appellant.
    Jere L. Humphrey, Kizer, Neu, Joyce Wyland Humphrey, Wagner & Gifford, Plymouth, Ind., for defendant-appellee.
    Before BAUER, Chief Judge, and CUDAHY and EASTERBROOK, Circuit Judges.
   CUDAHY, Circuit Judge.

The plaintiff Wayne Metz, age fifty-four, was discharged by his employer, defendant Transit Mix, Inc., after twenty-seven years of employment with the company. He alleges that he was fired in violation of the Age Discrimination in Employment Act (“ADEA”), 29 U.S.C.A. §§ 621-634 (West 1985 & Supp.1987). Following a bench trial, the district court entered judgment for Transit Mix, 646 F.Supp. 286 (N.D.Ind. 1986). For the reasons that follow, we reverse.

I.

Transit Mix is in the business of selling concrete to construction contractors. Metz worked for Transit Mix as manager of its plant in Knox, Indiana, a satellite of Transit Mix’s principal office and larger plant in Plymouth, Indiana. During the three years prior to Metz’s discharge, Transit Mix experienced financial problems which the district court attributed to the decline in the local construction business. In November 1983, Will Lawrence, the president of Transit Mix, notified Metz that due to Transit Mix’s poor sales, the Knox plant would be closed for the winter starting in December and Metz would be laid off. At that time, Lawrence had not decided whether he would close the Knox facility permanently or only for the winter.

In February 1984, Lawrence sent the assistant manager of the Plymouth plant, Donald Burzloff, to Knox to inspect the plant and make any necessary repairs. Burzloff obtained permission to take orders from the plant’s regular customers while he was there. Burzloff later requested that he be allowed to manage the Knox facility. Lawrence approved this request and in April 1984 discharged Metz.

At the time of his layoff in December 1983, Metz had an annual salary of $26,000, or about $15.75 an hour. He was among the highest paid of Transit Mix employees and, having worked for Transit Mix for twenty-seven years, was the second most senior employee there. Metz’s relatively high salary was a direct result of his many years of employment by Transit Mix; Lawrence testified at trial that Metz was given a raise each year, including years when Transit Mix was losing money. Burzloff was forty-three and had worked for Transit Mix for seventeen years when he replaced the fifty-four-year-old Metz as manager. Burzloff’s salary as manager was about $8.05 an hour.

II.

The ADEA prohibits employers from discriminating against employees on the basis of age. 29 U.S.C. § 623(a). Its objective in part is to promote employment of older workers on the basis of their abilities rather than their age. 29 U.S.C. § 621. The statute does not, however, prevent an employer from terminating an older worker based on reasonable factors other than age. 29 U.S.C. § 623(f)(1). When, as in the present case, a plaintiff is proceeding on a disparate treatment analysis, the plaintiff may recover only if the defendant in discharging the plaintiff was motivated by a discriminatory animus; that is, the plaintiff may recover only if his or her age was a determining factor in the employer’s decision.

Proving intentional discrimination is often difficult, so a plaintiff may do so by presenting either direct or indirect evidence of discrimination. Graefenhain v. Pabst Brewing Co., 827 F.2d 13, 17 (7th Cir.1987); Bechold v. IGW Sys., Inc., 817 F.2d 1282, 1284 (7th Cir.1987); LaMontagne v. American Convenience Prods., Inc., 750 F.2d 1405, 1409 (7th Cir.1984). In order to permit recovery for an ADEA claim through indirect means, this circuit has adopted a variation of the burden-shifting analysis set forth by the Supreme Court in the Title VII context for establishing a prima facie case of employment discrimination. See McDonnell Douglas Corp. v. Green, 411 U.S. 792, 93 S.Ct. 1817, 36 L.Ed.2d 668 (1973). As applied to an ADEA claim, this analysis requires that a plaintiff show that he or she: 1) belongs to the protected class (age forty or older); 2) was qualified for his or her position; 3) was terminated; and 4) was replaced by a younger person. After the plaintiff has established a prima facie case, the defendant employer then has the burden of presenting evidence that the plaintiff’s discharge was a result of “some legitimate, nondiscriminatory reason.” If the defendant meets this burden of production, the burden shifts to the plaintiff to prove that the reasons proffered by the employer for the discharge were merely a pretext for discrimination. Id. at 802-05, 93 S.Ct. at 1824-25; Graefenhain, 827 F.2d at 17-18; Bechold, 817 F.2d at 1284; LaMontagne, 750 F.2d at 1409. Throughout the trial, the burden remains with the plaintiff to prove there was discrimination, rather than with the employer to prove the absence of discrimination. LaMontagne, 750 F.2d at 1409.

The district court found that Metz had established a prima facie case of age discrimination. The court further found that a determining factor in Transit Mix’s decision to replace Metz with Burzloff was a desire to save the higher cost of Metz’s salary and that this factor “bore a relationship to Mr. Metz’s age.” 646 F.Supp. at 293. The court held, however, that this was not age discrimination in violation of the ADEA because it was based on an assessment of the cost of employing an individual employee, namely, Metz, rather than an impermissible assessment of the costs of employing Transit Mix’s older employees as a group. The sole issue on appeal is whether the salary savings that can be realized by replacing a single employee in the ADEA age-protected range with a younger, lower-salaried employee constitutes a permissible, nondiscriminatory justification for the replacement.

III.

Congress enacted the ADEA in response to the problems that the older worker faces in the job market, including the obstacles that the long-term employee encounters when he or she is suddenly without work. See generally Report of Secretary of Labor to Congress, The Older American Worker: Age Discrimination in Employment 11-17 (1965), reprinted in EEOC, Legislative History of the Age Discrimination in Employment Act, 16, 28-34 (1981). These difficulties have been attributed in large part to the worker’s development of firm-specific skills not easily transferable to a different job setting. National Commission for Employment Policy, 9th Annual Report, Rep. No. 17, Older Workers: Prospects, Problems and Policies 4 (1985). Therefore, while the older employee’s higher salary reflects the value of improved skills and the increased productivity that results, it is also indicative of one of the very problems the ADEA was intended to address: the likelihood that the employee will be less employable in other settings.

The ADEA has consistently been interpreted by the administrative agencies charged with its enforcement and the courts to prohibit an employer from replacing higher paid employees with lower paid employees in order to save money. The Equal Employment Opportunity Commission guidelines expressly provide that “A differentiation based on the average cost of employing older employees as a group is unlawful except with respect to employee benefit plans which qualify for the section 4(f)(2) exception to the Act.” 29 C.F.R. § 1625.7(f) (1986). This position is consistent with that adopted by the Department of Labor when it administered the ADEA:

It should also be made clear that a general assertion that the average cost of employing older workers as a group is higher than the average cost of employing younger workers as a group will not be recognized as a differentiation under the terms and provisions of the Act, unless one of the other statutory exceptions applies. To classify or group employees on the basis of age for the purpose of comparing costs, or for any other purpose, necessarily rests on the assumption that the age factor alone may be used to justify a differentiation — an assumption plainly contrary to the terms of the Act and the purpose of Congress in enacting it. Differentials so based would serve only to perpetuate and promote the very discrimination at which the Act is directed.

29 C.F.R. § 860.103(h) (1979) (emphasis added). Courts have also emphatically rejected business practices in which “the plain intent and effect ... was to eliminate older workers who had built up, through years of satisfactory service, higher salaries than their younger counterparts.” Leftwich v. Harris-Stowe State College, 702 F.2d 686, 691 (8th Cir.1983); see also EEOC v. Chrysler Corp., 733 F.2d 1183 (6th Cir.1984); Dace v. ACF Indus., Inc., 722 F.2d 374 (8th Cir.1983), aff'd on rehearing, 728 F.2d 976 (1984); Geller v. Markham, 635 F.2d 1027 (2d Cir.1980), cert. denied, 451 U.S. 945, 101 S.Ct. 2028, 68 L.Ed.2d 332 (1981). See generally 1 H. Eglit, Age Discrimination § 16.32 (1985).

Neither the district court nor Transit Mix on appeal takes issue with this interpretation of the ADEA in the context of policies that eliminate older employees as a group based on their higher salaries. Rather, they argue for a distinction based on whether the employer’s employment action, motivated by a desire to save costs, affects a group of employees or an individual employee. The district court held that while the former would be impermissible age discrimination, the latter is a legitimate, nondiscriminatory reason for replacing an employee. The court cited a treatise for support as follows:

“The relatively higher cost of employing older workers as a group is generally rejected as an RFOA [reasonable factor other than age]. The cost of employing an older worker when considered on an individual basis, however, may constitute an RFOA.” B. Schlei & P. Grossman, Employment Discrimination Law 506 (2d ed. 1983).

646 F.Supp. at 294. We find that this statement of the law, as interpreted by the district court, is inaccurate. Neither the policies behind the ADEA nor the relevant case law supports making this distinction and we find it to be an inappropriate distinction as applied to Metz’s claim.

The ADEA is aimed at protecting the individual employee. Section 623(a)(1) prohibits practices that “discriminate against any individual ... because of such individual’s age.” (Emphasis added). The statute’s language indicates that it shares the same focus as Title VII legislation: “fairness to individuals rather than fairness to classes.” City of Los Angeles Dep’t of Water & Power v. Manhart, 435 U.S. 702, 709, 98 S.Ct. 1370, 1376, 55 L.Ed.2d 657, 666 (1978); see also Connecticut v. Teal, 457 U.S. 440, 453-54, 102 S.Ct. 2525, 2534, 73 L.Ed.2d 130, 141 (1982) (“The principal focus of [Title VII] is the protection of the individual employee, rather than the protection of the minority group as a whole. Indeed, the entire statute and its legislative history are replete with references to protection for the individual employee.”). The same ADEA policy concern that forms the basis for rejecting cost-based employer practices that have an adverse impact upon older workers as a group is present in the case of Metz’s discharge: Given the correlation between Metz’s higher salary and his years of satisfactory service, allowing Transit Mix to replace Metz based on the higher cost of employing him would defeat the intent of the statute.

This position is consistent with past decisions that have found in favor of employees’ ADEA claims as well as those that have found for the employer. In Leftwich, 702 F.2d 686, an employer defending an ADEA claim argued that although its employment selection plan had a detrimental disparate impact on older employees, the plan was justified because it was adopted as a cost-saving measure. The Eighth Circuit found that this cost justification did not establish a business necessity defense:

Here, the defendants’ selection plan was based on tenure status rather than explicitly on age. Nonetheless, because of the close relationship between tenure status and age, the plain intent and effect of the defendants’ practice was to eliminate older workers who had built up, through years of satisfactory service, higher salaries than their younger counterparts. If the existence of such higher salaries can be used to justify discharging older employees, then the purpose of the ADEA will be defeated.

Id. at 691.

Although Leftwich involved a disparate impact claim, the reasoning behind its holding can apply equally to a discriminatory treatment claim brought by an individual employee where, because of the high correlation between age and salary, it would undermine the goals of the ADEA to recognize cost-cutting as a nondiscriminatory justification for an employment decision. The Eighth Circuit itself applied the reasoning in Leftwich to an ADEA claim of discriminatory treatment brought by a single employee. Dace v. ACF Indus., Inc., 722 F.2d 374 (8th Cir.1983), aff'd on rehearing, 728 F.2d 976 (1984). In upholding a jury verdict in favor of the plaintiff, the court quoted the portion of Leftwich that we have reprinted above and characterized Leftwich as holding “that discrimination on the basis of factors, like seniority, that invariably would have a disparate impact on older employees is improper under the ADEA.” Id. at 378. In a third case, the Eighth Circuit found that although an employer has the right to abolish a position held by an older worker and combine that position’s responsibilities with the duties of a younger person, it distinguished such a situation from one in which “the position remained the same” and the employer knew the replacement would save money. Holley v. Sanyo Mfg., Inc., 771 F.2d 1161, 1168 (8th Cir.1985) (emphasis added). The court stated that there would be a much stronger claim for recovery in the latter case. Id.

A district court for the Eastern District of New York similarly held that cost-cutting is not a legitimate, nondiscriminatory reason for discharging an older employee while retaining younger, lower-paid employees. Marshall v. Arlene Knitwear, Inc., 454 F.Supp. 715 (E.D.N.Y.1978), aff'd in part, rev’d and remanded in part without opinion, 608 F.2d 1369 (2d Cir.1979). The court stated that, although in the absence of the ADEA this might have been a valid business justification, “Congress has decreed in the ADEA that an employee may not be discharged because of her age. Where economic savings and expectation of longer future service are directly related to an employee’s age, it is a violation of the ADEA to discharge the employee for those reasons.” Id. at 728. The court found that the plaintiff had proven her ADEA discriminatory treatment claim:

The evidence compels the conclusion that the savings in salary and the unpaid pension benefits accruing to defendants as a result of [the plaintiff’s] discharge were the controlling economic factors behind her termination. Since such economic factors are directly related to age, [the defendant’s] reliance on them to discharge [the plaintiff] constitutes age discrimination.

Id. at 730.

In Geller v. Markham, 635 F.2d 1027 (2d Cir.1980), cert. denied, 451 U.S. 945, 101 S.Ct. 2028, 68 L.Ed.2d 332 (1981), the Second Circuit held that a school board policy that limited teacher hiring to persons with less than five years’ experience violated the ADEA. The court further found that the plaintiff, an older teacher replaced under the school board’s policy, could recover on theories of both disparate impact, based on the plaintiff’s membership in a group unfairly affected by the policy, and disparate treatment, based on her individual replacement by a younger teacher. The court, citing Marshall approvingly, rejected the defendants’ defense that the policy “was supportable as a necessary cost-cutting gesture in the face of tight budgetary constraints.” Id. at 1034.

The Sixth Circuit has held that “the prospect of imminent bankruptcy” may qualify as a “reasonable factor other than age” and thus justify, for example, a forced retirement policy. EEOC v. Chrysler Corp., 733 F.2d 1183, 1186 (6th Cir.1984). The court described two tests that the employer must meet to establish a reasonable-factor-other-than-age defense based on the economic needs of a failing company. “First, the necessity for drastic cost reduction obviously must be real.... Second, the forced early retirements must be the least-detrimental-alternative means available to reduce costs.” Id. Even if we were to adopt a similar economic necessity exception in the present case, Transit Mix would not satisfy this two-part test. We are not convinced that Transit Mix’s financial solvency was sufficiently in jeopardy to meet Chrysler’s first requirement. More important, Transit Mix clearly has not satisfied the second requirement. Transit Mix did not pursue obvious less-detrimental alternatives to replacing Metz, such as offering Metz continued employment at a lower salary or in a different position. The district court expressly found that Transit Mix “did not ask Mr. Metz to take a pay cut or to take a different job within the company.” 646 F.Supp. at 290.

IV.

The dissent presents a number of interesting insights into the nature of age discrimination and the role of productivity as a legitimate factor in employment decisions. But, while sweeping in its approach, the dissent fails to come to grips with the specific facts of this case.

Metz’s relatively high salary was the result of annual raises that were given to him by Transit Mix regardless of how the company was doing financially. See supra note 2. Metz’s salary therefore reflected his twenty-seven years of service to Transit Mix. When Lawrence, the president of Transit Mix, decided that the company’s poor performance no longer justified the salary that the company had given Metz, Lawrence replaced Metz because of that salary without first asking Metz to take a pay cut. Given these facts, Lawrence’s desire to save costs was not a permissible, nondiscriminatory reason for replacing Metz with the younger, less-costly Burzloff; by thus replacing Metz, Transit Mix violated Metz’s rights under the ADEA.

We, of course, recognize that our use of pay as a “proxy” for age, although inescapable in this particular case, is of limited application and may be employed only on a case-by-case basis where the facts support its use. We do not agree with the dissent that cross-sectional studies of pay in relation to age have much value here. There are any number of reasons why the average fifty-five-year-old might be earning less than the average forty-year-old. For example, as the dissent suggests, see infra p. 1218, younger employees as a group may be better educated and therefore better qualified when entering the workforce than are older employees. Employees may also invest more time and resources in improving their skills through training and education during their early years of employment. Employees may choose less demanding, and therefore lower paying, work as they grow older. In addition, many high-paying jobs require strength, speed, dexterity, endurance and other physical attributes and may even be compensated on a piece-work basis. At fifty-five many employees may be physically disqualified from or limited in high-speed, physically demanding tasks in such places as automobile plants or packinghouses. They may by that age have been down-graded to janitors. And there are not many fifty-five-year-olds playing major league baseball. By age fifty-five many people may have been laid-off or discharged from formerly high-paying factory or transportation jobs and may find work as security guards. Finally, age discrimination on the part of employers may account for some of the decline in the average salary of older workers. In any event, no matter what the facts, only federal judges under the Constitution have guaranteed earnings regardless of productivity until they die.

In the case of Metz, however, the facts are much narrower. He and Burzloff were both plant managers — apparently of equivalent competence. Their work is of the sort where declining physical effectiveness through aging is not apparently of consequence and may be more than offset by growth in experience. The facts suggest, as is usual with this type of work, that seniority is a factor in compensation and age and seniority are, of course, strongly correlated. Metz is paid more — as are most middle managers — because he has been there longer. There may be other reasons for the pay disparity but certainly seniority is an important one.

The dissent postulates output or productivity per wage dollar as a legitimate factor in discharge decisions. The dissent is then able to equate high pay with low productivity per wage dollar and thereby legitimate high pay as a reason for lay-off. The dissent maintains that since Metz, who is senior, is paid more because of his seniority (age), he may be fired for that reason alone. Because of his higher pay, awarded for seniority, he is automatically less productive per wage dollar and therefore becomes subject to termination. By this way of thinking, seniority (and hence age) is translated into a perfectly acceptable excuse for firing everyone who receives seniority pay raises.

Thus, if a company has twenty foremen, all of exactly equal ability, and the oldest ten make more money than the others because their average seniority is much higher, according to the dissent the employer would have a complete defense to an age discrimination charge when it fires the ten graybeards. In middle management jobs we would expect pay to reflect seniority and hence to be something of a proxy for age. This is how the civil service works and private industry usually is not much different. To accept the approach of the dissent is to make totally vulnerable the employees who are paid a little more because they have been with the company a little longer. All this has nothing to do with whether older employees across the economy make more or less on average than younger ones (which would presumably be revealed by cross-sectional analysis).

Nor do we accept the view of the dissent that discharge and reduction in pay must be regarded as equivalents under the ADEA for the purposes of this case. After all, discharge is “the industrial equivalent of capital punishment.” Complete Auto Transit, Inc. v. Reis, 451 U.S. 401, 421, 101 S.Ct. 1836, 1847, 68 L.Ed.2d 248, 268 (1981) (Powell, J., concurring in part and concurring in judgment) (quoting Whitman, Wild Cat Strikes: The Union’s Narrowing Path to Rectitude?, 50 Ind.L.J. 472, 481 (1975)). And, as the dissent makes clear, economic imperatives must be continually balanced against the requirements of the age discrimination law. At least two things are clear: most older employees (who have difficulty getting new jobs) would prefer a wage reduction to being fired, and many employers, knowing of the morale problems created by wage cuts, would prefer to terminate older employees rather than have them remain at work with their morale in serious disarray because their pay was reduced. For this reason, we think general pay reductions are less a threat to senior employees than terminations would be (in part because employers are less likely to cut pay unless economic circumstances absolutely require it). Certainly, however, in the case before us, we lay down no general rules about what circumstances might justify pay cuts for older employees. We only suggest that the language of the statute does not require that in this case we regard discharge or reduction in pay as the same thing (although they may have economic similarities and, under proper circumstances, they can both result in a successful ADEA claim). It is common knowledge that older employees tend to protect their jobs at all costs — even at the cost of a reduction in pay.

The essential problem with the dissent’s approach is that pay for middle management jobs is, at least in the short run and within the broad limits of competition, under the control of the employer. The logic of the dissent’s position is that an employer may reward years of service for middle management employees with raises in the paycheck. If this is the practice, as it frequently is, when the middle managers reach age fifty or sixty, they may all be terminated since all will be making more money than younger managers with equivalent jobs. If we assume that all managers at a given level are of equivalent proficiency, as we must for purposes of analysis in the instant case, under the dissent’s analysis the managers who are paid the most are by definition the least productive per wage dollar. Through its control over productivity per wage dollar, the management would effectively decide who could be terminated as its employees reach a relatively advanced age.

The dissent’s approach to “productivity” as a rationale for discharge is inconsistent with the policies chosen by Congress in enacting the ADEA. As this circuit has previously recognized, the ADEA imposes some costs on employers and deprives employers of some decisionmaking autonomy in order to treat our nation’s older employees fairly:

[A]lthough the ADEA does not hand federal courts a roving commission to review business judgments, the ADEA does create a cause of action against business decisions that merge with age discrimination. Congress enacted the ADEA precisely because many employers or younger business executives act as if they believe that there are good business reasons for discriminating against older employees. Retention of senior employees who can be replaced by younger, lower-paid persons frequently competes with other values, such as profits or conceptions of economic efficiency. The ADEA represents a choice among these values. It stands for the proposition that this is a better country for its willingness to pay the costs for treating older employees fairly.

Graefenhain, 827 F.2d at 21 n. 8 (emphasis in original).

The dissent mentions the higher cost of some fringe benefits for older employees, which is noted in the legislative history of the ADEA. The cost of some fringe benefits does increase with age and it might be said that the cost of these benefits reduces the productivity per fringe dollar of older employees. For example, after fifty, employees may incur higher costs for the provision of health insurance and health care and, under most benefit plans, more senior employees are entitled to longer vacations. But it has not been argued that these higher costs, and by hypothesis lower productivity per dollar, should be reason for exposing older employees to discharge in the face of the age discrimination law. There is even less reason for firing because of higher salaries than because of higher fringes. Salaries are, within a substantial range, in the control of the employer, while fringes — medical costs, for example — may not be. Hence, as a basis for discharge we believe these cost factors must be evaluated critically.

We are, of course, aware that employers must control costs if they are to remain competitive and that this imperative of survival will inevitably create tensions with the legal prohibitions against age discrimination. We think it would be unwise, however, to translate this imperative into a rule that an older employee can be fired and replaced by an equally proficient younger employee merely because the older employee happens to be earning more money at the moment. There are a number of less burdensome measures that can be introduced if necessary before “industrial capital punishment” is brought into play. We therefore reverse the judgment of the district court and remand for a determination of the appropriate relief.

Reversed and Remanded.

EASTERBROOK, Circuit Judge, dissenting.

Transit Mix laid off Metz and closed the Knox cement plant, which he was managing, because that plant’s sales were insufficient to justify its operation. Five months later it reopened the plant under the management of Burzloff, who made a little more than half of Metz’s $26,000 salary. The district court found that “Metz’s salary was too high to justify in light of the poor performance of the Knox plant.” The court also found that because of “differences of opinion and style between Mr. Metz and those who populate the Plymouth plant [Transit Mix’s other, larger, plant], it was legitimate and non-discriminatory” not to employ Metz at Plymouth or to ask Metz to put the Knox plant back in operation. Burzloff, who had worked for 18 years at the Plymouth plant, could be recalled to Plymouth if Knox should prove unprofitable again; Metz could not be detailed to Plymouth. The district court concluded “that each of these reasons — the greater flexibility afforded by Mr. Burzloff and the salary savings — was a determining factor in the decision to terminate Mr. Metz.” My colleagues do not hold that any of these findings is clearly erroneous. See Pullman-Standard v. Swint, 456 U.S. 273, 287-90, 102 S.Ct. 1781, 1789-91, 72 L.Ed.2d 66 (1982).

I

The district court expressed the view that Transit Mix was entitled to take Metz’s salary into account. 646 F.Supp. 286, 291-94 (N.D.Ind.1986). The majority disagrees. But we review judgments, not opinions, and it is hard to see how Metz’s salary mattered. The district court found that both salary and flexibility were “determining” factors. It is clear from the context that the court meant sufficient rather than necessary conditions. That is, Transit Mix was not going to reopen the Knox plant unless its manager could work at Plymouth too. My colleagues do not disturb this finding. Because causation is an essential part of the plaintiff’s burden in a disparate treatment case, see Dale v. Chicago Tribune Co., 797 F.2d 458, 462 (7th Cir.1986); Sherkow v. Wisconsin, 630 F.2d 498, 502 (7th Cir.1980), Metz loses no matter what we make of the district court’s approval of the salary business. So although I see why my colleagues disagree with the district court’s opinion, I do not understand why they disagree with its judgment.

The district court also found that Transit Mix was not going to pay $26,000 to a manager at Knox, because the plant did not do enough business to support such a salary. The sales of the Knox plant were less than $300,000 in each of 1982 and 1983, falling from earlier levels. “Metz’s salary was too high to justify in light of the poor performance of the Knox plant.” We have held, in common with every other court to consider the issue, that a firm may lay off or fire employees of any age when economic conditions make that prudent. E.g., Tice v. Lampert Yards, Inc., 761 F.2d 1210 (7th Cir.1985); Dorsch v. L.B. Foster Co., 782 F.2d 1421 (7th Cir.1986); Sahadi v. Reynolds Chemical, 636 F.2d 1116 (6th Cir.1980); Price v. Maryland Casualty Co., 561 F.2d 609 (5th Cir.1977). “Economic conditions” implies a comparison of the employees’ wages with their product; a plant that is unprofitable when the average wage is $20 per hour may be a bonanza for the firm when the average wage is $10. If Transit Mix had said: “We are losing money at Knox, in part because of your high salary, so we are closing that plant”, Metz could not have complained. If Transit Mix had known what the court holds today — that it is forbidden to replace Metz with another employee at lower salary — it would have kept the Knox plant shuttered. Metz still would be out of work. He would have been fired had he been 35 years old and everything else the same. This is age discrimination?

Metz’s victory today is Pyrrhic — not for him, but for older employees in general. The court tells employers to keep their plants closed. Throw overpaid employees out of work because their salaries are high (as Tice permits) but don’t you dare hire anyone else at a lower salary to do the work. If that rule were widely followed, the Metzes of the world would be no better off, and the Burzloffs (also in the protected age group) would be worse off. They would be denied advancement, and other employees whom the Burzloffs would manage at Knox would never be hired. If Congress wants such a stultifying result, if Congress wants to hurt older workers, so be it. But judges should not go out of their way to injure protected groups. The ADEA as it exists does not prohibit consideration of the relation between an employee’s salary and his productivity.

II

My colleagues’ treatment of “wage discrimination” under the ADEA has the support of several other courts. Fair arguments may be made on both sides. But I am persuaded that my brethren, and these other courts, have settled on an approach that is too broad, and I shall try to explain why. Wage discrimination is age discrimination only when wage depends directly on age, so that the use of one is a pretext for the other; high covariance is not sufficient, and employers always should be entitled to consider the relation between a particular employee’s wage and his productivity.

Section 4(a) of the ADEA, 29 U.S.C. § 623(a), provides in part that it is unlawful for an employer:

(1) to ... discriminate against any individual with respect to his compensation, terms, conditions, or privileges of employment, because of such individual’s age; [or]
(2) to limit, segregate, or classify his employees in any way which would deprive or tend to deprive any individual of employment opportunities or otherwise adversely affect his status as an employee, because of such individual’s age;

Section 4(f)(1), 29 U.S.C. § 623(f)(1), adds that it is lawful “to take any action ... where the differentiation is based on reasonable factors other than age”. A natural reading is that an employer may take into account wages, which are “factors other than age”. Many people under 40 (the lower bound of the protected group) earn $26,000 or more; if such a salary exposes them to discharge on economic grounds, then it should expose older employees to discharge. You do not get immunity from an otherwise lawful employment decision by growing old. As my colleagues say, the “ADEA is aimed at protecting the individual employee” (Maj. op. at 1206), but what it protects each employee against is age discrimination. The Act prohibits adverse personnel actions based on myths, stereotypes, and group averages, as well as lackadaisical decisions in which employers use age as a proxy for something that matters (such as gumption) without troubling to decide employee-by-employee who can still do the work and who can’t. The ADEA does not protect anyone against decisions based on actual performance.

The contrary view starts from the belief that wage and age are correlated. But age and ability also are correlated. For many years employees add to their skills and as a result do better work; eventually the tables turn, as mental and motor skills slip away. This proceeds at different paces for different people; the ADEA ensures that employers examine each employee’s actual performance rather than the average performance of a group defined by age. No one doubts, however, that an employer may discharge an employee, of any age, who no longer performs the job with acceptable skill. But one could say about performance on the job exactly what my colleagues say about wages: a test based on performance hurts the old relative to the young. Does it follow that this adverse impact makes inquiry into performance impermissible?

The customary response is that no one is protected by the ADEA unless qualified for the job. An older employee whose skills have diminished is not qualified. Yet there are degrees of skill; an employee is not “qualified” one day and “unqualified” the next. In business the question is not “is Jones qualified?” but “can Jones do the job well enough to cover his wage?” A welder good enough to work on simple sheet metal at $10 per hour may be unqualified for a welding job, paying $30 per hour, in a nuclear plant or on a bridge where lives depend on the quality of the joint and other, better welders compete for the position. There is no “qualified welder” in the abstract, and there is no “qualified manager of a cement plant” either. To say that someone is “qualified” to manage the Knox plant is to say that he can handle the manufacture and sale of concrete well enough that he adds to the value of the enterprise at least the cost of his salary. If he cannot do this, he is unqualified for the particular job at the particular time. It is therefore not possible to divorce the ability to do a job from the wage demanded. If the ADEA allows employers to make decisions based on performance — surely it does, even though performance is systematically related to age — then it also allows employers to make decisions based on the interaction of performance and wage. If the wage is too high for the performance, the employer may act.

My colleagues concede as much when they say (Maj. op. at 1208, 1209-10) that Transit Mix could have cut Metz’s salary. Cases such as Tice and Dorsch hold that employers also may fire workers whose productivity does not justify their wage. If these things are true, however, then the rule the majority creates — that employers cannot act on the basis of salary — cannot be right. More, the language of the ADEA will not sustain a difference between firing an employee based on salary (which my colleagues think forbidden) and reducing an employee’s salary based on salary (which my colleagues think OK). The premise of the court’s opinion is that wage is the equivalent of age, and to treat an employee adversely because of his high wage is illegal because it has a disproportionately large effect on older employees. Reducing the salary of higher-paid employees also affects older employees adversely, and therefore should be equally illegal. Section 4(a)(1) lists compensation as one forbidden ground. It would be a shocking violation of the ADEA to reduce by 50% the wages of all employees 50 and up; yet my colleagues suggest that Transit Mix should have done just that to Metz. Discharge and a reduction in salary are treated the same under § 4(a)(1). If one is off limits, so is the other; neither the language nor the structure of the Act creates the sort of distinction my colleagues suggest. One of the principal reasons for enacting the ADEA was a belief that people dismissed at advanced ages cannot obtain jobs at equivalent pay elsewhere; most employees care about the discharge because of its financial consequences, not because of a sentimental attachment to their employers; discharge is really no different from a reduction in income to the salary paid by the next employer; yet the court ironically says that it is fine to reduce the pay of older employees, so long as it is reduced “at home.”

My colleagues view salary reduction as a less restrictive alternative and therefore preferable. Perhaps it is, but we have held that the ADEA does not require employers to use less restrictive alternatives such as offering employees other jobs rather than firing them. E.g., Tice, 761 F.2d at 1217. The disparate impact model of race discrimination law on which my brethren draw instead demands that tests or devices with disparate effects be validated or justified by a business necessity. See Griggs v. Duke Power Co., 401 U.S. 424, 431, 91 S.Ct. 849, 853, 28 L.Ed.2d 158 (1971); Dothard v. Rawlinson, 433 U.S. 321, 328-32, 97 S.Ct. 2720, 2726-28, 53 L.Ed.2d 786 (1977). The majority waters down the disparate impact approach even as it borrows.

If we apply disparate impact analysis rigorously to decisions based on wages, we will require some fundamental changes in the operation of American business. A firm may not close a plant or curtail its operations on the basis of high wage costs. At a minimum, the court must determine whether a general wage reduction would have restored the plant’s profitability. A firm may not give lower-paid employees a wage increase without doing the same for higher-paid employees. Many times an increase will help the lower-paid employees catch up with others; if wage discrimination is age discrimination, this differential increase is presumptively unlawful. (I pass the question whether the ADEA would require an equal percentage increase or an equal dollar increase — whether, indeed, an across-the-board percentage increase might be called discrimination against the younger employees in the protected group who receive lower absolute dollar increases.) In times of corporate austerity, firms may freeze or reduce the salaries of their managers and other well-paid employees; no more, because that is age discrimination.

III

I would accept all of this if the ADEA required it. The language of the Act does not, however, and neither does the analogy to disparate impact cases under Title VII of the Civil Rights Act of 1964. It is time to unscramble the strands of doctrine involved in this and similar cases.

Anti-discrimination law uses two forms of inquiry: disparate treatment and disparate impact. (This ugly use of “impact” instead of “effect” is ingrained, and I follow the convention.) The plaintiff in a disparate treatment case contends that the employer treated him adversely because of a forbidden characteristic. He must make a prima facie case, see McDonnell-Douglas Corp. v. Green, 411 U.S. 792, 93 S.Ct. 1817, 36 L.Ed.2d 668 (1973), after which the employer must articulate (but not prove) a neutral explanation for its action, see Texas Department of Community Affairs v. Burdine, 450 U.S. 248, 101 S.Ct. 1089, 67 L.Ed.2d 207 (1981). The plaintiff then must show that the employer treated him adversely because of the prohibited characteristic. This means intent and causation. Postal Service Board of Governors v. Aikens, 460 U.S. 711, 714-15, 103 S.Ct. 1478, 1481, 75 L.Ed.2d 403 (1983). Once the evidence is in, the presumption created by the prima facie showing “drops from the case”, Burdine, 450 U.S. at 255 n. 10, 101 S.Ct. at 1095 n. 10. See also Kier v. Commercial Union Insurance Co., 808 F.2d 1254, 1257 (7th Cir.1987); Morgan v. South Bend Community School Corp., 797 F.2d 471, 480 (7th Cir.1986). The disparate treatment model, designed for handling individual cases of discrimination, makes it easy for the plaintiff to get to court (the prima facie case is not demanding) and requires an employer to produce an explanation; but once the employer does this the employee faces stiff burdens on intent and causation.

The plaintiff in a disparate impact case complains not about what happened to him (though that may play some role) but about a test or device the employer is using. The plaintiff shows that the test or device selects by race (or other prohibited characteristic). The employer then must show that there is a good reason for this selectivity. One good reason may be the “validity” of the test — that is, its ability to predict performance on the job or its association with some trait essential for the job. The other approach is “business necessity”; the employer tries to show that it can’t live without the test or practice. The disparate impact model is addressed to class-wide discrimination, usually unintentional. The plaintiff bears a heavy burden and often must amass a great deal of data (and subject it to expert statistical analysis) to show the disparate impact. The defendant has a correspondingly high burden once the plaintiff shows that the test or device systematically filters out members of the protected group.

Metz filed and litigated this case under the disparate treatment model. He was accordingly required to show intent and causation — which, as I have pointed out, he failed to do. My colleagues bail him out by merging the two models. They allow him to get into court with a prima facie case rather than with the daunting statistical showing of a class-wide disparate impact. Then they require the employer to refute this ersatz disparate impact case; it is not enough, they say, for the employer to advance a legitimate reason. And at the end, they conclude, the trier of fact must (not just may) infer intent from the unrefuted disparate impact case, so that Metz prevails. This mixture gives Metz the benefit of the easy parts of both models. Only by using the aspects of the disparate treatment and disparate impact routes most favorable to plaintiffs, and discarding the aspects of each approach favorable to employers, does the court find a violation in today’s case.

The two methods of proof should be kept separate. They are built on different premises: disparate treatment on the premise that employees are identical, so that differential treatment must be attributed to use of the prohibited characteristic, and disparate impact on the premise that because of a history of discrimination employees are different, so that employers must be prevented from using arbitrary tests and devices that play on that regrettable difference without advancing any legitimate interest. Putting the two theories together yields nothing but confusion. See Douglas Laycock, Statistical Proof and Theories of Discrimination, 49 L. & Contemp.Prob. 66 (Aut.1986). This case shows why.

As a disparate treatment case, Metz’s claim falls short. Transit Mix articulated two justifications other than Metz’s age: the need to have a manager at Knox who also could work at Plymouth, and the relation between Metz's salary and the revenues of the Knox plant. The district court credited the first as an accurate and sufficient reason. The second, too, was a sufficient reason; although perhaps related to Metz’s age in a statistical sense, such relatedness does not show discriminatory intent in a disparate treatment case. As a disparate impact case Metz’s claim is equally weak. A disparate impact claim depends on groupwide adverse effects, which Metz never offered to show. It also depends on the outcome of a validation study and an inquiry into business necessity, which no court has conducted.

The consequences of scrambling the two models are most apparent in the court’s treatment of intent. The plaintiff in a disparate treatment case must show discriminatory intent. There is no finding that Transit Mix treated Metz adversely because of his age; there is only a finding that Transit Mix considered something that is correlated with age. Yet the majority allows disparate impact to substitute for intent, although all the disparate impact cases reflect the belief that disparate impact and intent are different. They allow liability in the absence of discriminatory intent. See Griggs, 401 U.S. at 431-32, 91 S.Ct. at 853-54. Intent means doing something because of, not in spite of, a particular consequence. Personnel Administrator v. Feeney, 442 U.S. 256, 279, 99 S.Ct. 2282, 2296, 60 L.Ed.2d 870 (1979). That means using wage to get at age. Metz did not claim that Transit Mix used his wage as a smokescreen; the record shows, and the district court found, that Transit Mix used Metz’s wage with indifference to his age, rather than because of it. Feeney and Washington v. Davis, 426 U.S. 229, 239-45, 96 S.Ct. 2040, 2047-50, 48 L.Ed.2d 597 (1976), reject the equation between disparate impact and intent on which my colleagues’ conclusion depends. Both cases reverse decisions that had equated the two, or used disparate effect as the sole basis of inferring intent.

The two approaches are related in the sense that if in a disparate impact case a court declares that a particular employer may not insist that bricklayers have high school degrees, that employer could not respond to a later disparate treatment claim by saying “I did not hire Smith because he lacked a high school degree.” But it is not the law that if Duke Power Co. cannot demand a high school degree of its janitors, Boeing cannot demand a high school degree of its engineers. Each new job, each new employer, requires a separate inquiry. My colleagues have fused disparate treatment and disparate impact rules in such a way that one employer’s loss in a disparate impact case means that no employer can use a particular ground of decision in a disparate treatment case. The Supreme Court has declined to allow this fusion. For example, Furnco Construction Corp. v. Waters, 438 U.S. 567, 98 S.Ct. 2943, 57 L.Ed.2d 957 (1978), held that although the employer had used a number of arbitrary and subjective factors in hiring employees, factors that may well have had a disparate impact on minority applicants, any given applicant’s suit still had to satisfy the requirements of a disparate treatment case. See also Teamsters v. United States, 431 U.S. 324, 367-71, 97 S.Ct. 1843, 1870-73, 52 L.Ed.2d 396 (1977). Once a plaintiff makes out a class-wide pattern, the court may shift to the employer the burden of showing that the discrimination did not affect the plaintiff personally; until then, each plaintiff must carry the burden on intent and causation. See Furnco, Burdine, and Aikens. The melding of the two strands of discrimination law effectively relieved Metz of his burden — indeed has allowed him to prevail even though the employer advanced, and the trier of fact credited, a sufficient reason utterly unrelated to his age. This unfortunate outcome is the wages of conceptual confusion.

IV

Perhaps, however, we could abandon the disparate treatment model in cases of this sort. I now inquire how Metz should fare if we were to explore that possibility. The ADEA was enacted in 1967, before the first of the disparate impact cases (Griggs, in 1971), so we cannot be confident that the Act adopts this method. Let us assume for the moment, however, that because of the parallel language in Title VII and § 4(a) of the ADEA this approach governs and ask whether it applies to decisions based on the relation between an employee’s wage and his productivity. I return toward the end of this opinion to the question whether there should be a disparate impact model in ADEA cases.

Griggs, which creates disparate impact analysis, identifies as the source of concern “practices, procedures, or tests neutral on their face” (401 U.S. at 430, 91 S.Ct. at 853) that affect groups differently. These tests or practices might be unrelated to any legitimate need of the business, and if so would be the kind of “artificial, arbitrary, and unnecessary barriers to employment” (id. at 431, 91 S.Ct. at 853) that are “discriminatory in operation” (ibid.). Griggs gives the employer two options when a test or practice affects a group of employees adversely: to “validate” the test (that is, to show that it predicts performance on the job and hence is not “arbitrary” and “unnecessary”) or to show “business necessity” (ibid.). If the test or practice is not “unrelated to measuring job capability” (id. at 432, 91 S.Ct. at 854) or bears a “manifest relationship to the employment in question” (ibid.) or is necessary, it may be used; otherwise not.

Since Griggs both the EEOC and the courts have produced an ocean of regulations and opinions trying to define what validity means and how necessitous an employer need be to use an unvalidated test. E.g., New York City Transit Authority v. Beazer, 440 U.S. 568, 587 n. 31, 99 S.Ct. 1355, 1366 n. 31, 59 L.Ed.2d 587 (1979); Dothard, 433 U.S. at 328-37, 97 S.Ct. at 2726-31; Davis, 426 U.S. at 249-52, 96 S.Ct. at 2052-53; Albemarle Paper Co. v. Moody, 422 U.S. 405, 425-36, 95 S.Ct. 2362, 2375-80, 45 L.Ed.2d 280 (1975). Sometimes validation requires careful statistical testing; sometimes the job-relatedness of a test or device is apparent. Beazer, for example, holds that a subway system may decline to hire methadone users, although that has a racially disparate impact, because freedom from drugs is obviously related to safety. We held in Aguilera v. Cook County Police and Corrections Merit Board, 760 F.2d 844 (7th Cir.1985), that a government may make a high school degree a requirement for jobs in the police or the jail guard forces — even though Griggs itself dealt with a high school degree rule— because a minimum level of education is essential in such jobs.

Disparate impact analysis under Griggs has three steps: (1) identifying a test, device, or practice; (2) establishing that the test, device, or practice adversely affects a group protected by the statute; (3) assessing the validity or business necessity of a test that has disparate impact. The application of this approach to wage discrimination encounters problems at each step.

Where is the class-wide test, practice, or device? Decisions based on the relation between the value of the employee’s work and the pay he receives for it are scarcely arbitrary; to the contrary, they are essential in every business. This is individualized decisionmaking, the opposite of the rote and pointless tests the Supreme Court had in mind in Griggs. It is not a test, device, or practice at all.

Where is the disparate impact of considering wages? It is true, as my colleagues observe, that the average employee’s income tends to increase with age. Some employees regress (for example, lawyers earn less as judges than in the practice) but the usual direction is up. See Gary S. Becker, Human Capital: A Theoretical and Empirical Analysis, with Special Reference to Education 219 (2d ed. 1975) (time series analysis of people who completed their education in a single year); Jacob Mincer, Schooling, Experience, and Earnings 64-82, 101 (1974) (many time series analyses for different levels of education). This may occur because people do better work as time goes by, because they are better matched to their jobs, see Boyan Jovanovic, Job Matching and the Theory of Turnover, 87 J.Pol.Econ. 972 (1979), or because of other factors. Robert Topel, Job Mobility, Search, and Earnings Growth: A Reinterpretation of Human Capital Earnings Functions, 8 Research in Labor Economics 199 (1986).

The change in each person’s income with time is not the sense relevant to disparate impact analysis, however. We want to know whether wages for the body of employees rise with age: that is, do 50-year-old employees earn more than 40-year-olds at any given moment? Would an employer, bent on slashing costs, find older employees’ wages the most attractive target? This is a question about the profile of an employer’s wage bill by the age of its employees. The wage-age profile in a cross-sectional analysis (that is, the data obtained from a snapshot of everyone’s wages at an instant of time) shows wages rising through age 40 and thereafter declining. E.g., Lloyd G. Reynolds, Stanley H. Masters & Colletta H. Moser, Labor Economics and Labor Relations 234 (9th ed. 1986) (national cross section of annual wages of all employees in 1982). In 1982, employees age 60 had roughly the same average income as employees age 30. This is so in part because younger employees are better educated and therefore start at higher wages than employees did a generation ago. Data from 1981, reproduced in the National Commission for Employment Policy’s Ninth Annual Report: Older Workers: Prospects, Problems and Policies 16 (1985), have a similar pattern, though they show that employees aged 40-44 have the highest earnings. See Figure 1. The Bureau of Labor Statistics’ estimate for March 1987 has the same pattern. BLS News (Apr. 28, 1987).

No matter why the cross-section looks as it does, however, the foundation for a disparate impact analysis is shaky; the most one can say is that the average wage at (say) 50 is higher than the average wage at 25, even though the cross-section shows leveling off or decline after age 40. The data for hourly wages show much the same pattern. Census data from 1984 covering workers paid by the hour give median wages as follows:

Ages     Hourly Wage

16-19    $3.64
20-24    $4.94
25-29    $6.52
30-34    $7.23
35-39    $7.37
40-44    $7.17
45-49    $7.23
50-54    $7.20
55-59    $6.85
60-64    $6.45
65-69    $4.95
70+      $4.38

Department of Labor, Bureau of Labor Statistics, Monthly Labor Review 22 (Feb. 1986) (median covers all races and both sexes). No evidence in the record of this case shows that the wage-age profile at Transit Mix has an upward slope. If Metz wanted to use a disparate impact analysis, he should have built the statistical foundation. It is not appropriate to take judicial notice of a wage-age profile that creates a disparate impact problem — especially when the court is “noticing” something that is not true. See Fed.R.Evid. 201(b)(1) (judicial notice appropriate only when the fact is “not subject to reasonable dispute”).
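The shape of this cross-section can be checked directly from the twelve medians quoted above. What follows is only a rough sketch under that assumption: the figures are copied from the 1984 hourly-wage data in the text, and the names `MEDIAN_HOURLY_WAGE`, `peak_bracket`, and `changes` are hypothetical labels chosen for illustration, not anything in the record. It locates the age bracket with the highest median wage and lists the change from each bracket to the next.

```python
# Median hourly wages by age bracket, copied from the 1984 figures quoted above.
# The variable and function names are illustrative assumptions only.
MEDIAN_HOURLY_WAGE = [
    ("16-19", 3.64),
    ("20-24", 4.94),
    ("25-29", 6.52),
    ("30-34", 7.23),
    ("35-39", 7.37),
    ("40-44", 7.17),
    ("45-49", 7.23),
    ("50-54", 7.20),
    ("55-59", 6.85),
    ("60-64", 6.45),
    ("65-69", 4.95),
    ("70+", 4.38),
]


def peak_bracket(profile):
    """Return the (bracket, wage) pair with the highest median wage."""
    return max(profile, key=lambda row: row[1])


def changes(profile):
    """Return (bracket, change from the previous bracket), rounded to cents."""
    return [
        (later, round(later_wage - earlier_wage, 2))
        for (_, earlier_wage), (later, later_wage) in zip(profile, profile[1:])
    ]


if __name__ == "__main__":
    bracket, wage = peak_bracket(MEDIAN_HOURLY_WAGE)
    print(f"Highest median wage: ${wage:.2f} at ages {bracket}")
    for later, delta in changes(MEDIAN_HOURLY_WAGE):
        print(f"  into {later}: {delta:+.2f}")
```

On these figures the median peaks in the 35-39 bracket and is lower in every bracket from 40 up, although the path is not strictly downward (45-49 is slightly above 40-44). The sketch says nothing about the wage-age profile at Transit Mix itself, on which the record is silent.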

Finally, what happened to the search for validation or business necessity? Griggs does not condemn all tests or devices with disparate impact; it forbids only those that are not valid (job-related) or supported by strong business reasons. It is hard to imagine how the use of wages could not be valid; wages correspond precisely to the costs of doing business, and hence to profitability. We might have a validity problem if an employer tried to slash wages without regard to the employee’s performance. For example, a wage reduction for all employees over 60 could reflect the stereotypical belief that no one over 60 can do the job well. This is the kind of concern reflected in EEOC v. Chrysler Corp., 733 F.2d 1183 (6th Cir.1984), on which the majority relies. But a use of wage in relation to job performance — which is how Transit Mix used Metz’s wage — is “valid” almost by definition. This is why the district court was right to see a difference between across-the-board decisions and employee-by-employee decisions. My colleagues, however, have written validation out of the disparate impact test. Their opinion does not suggest that it matters whether Transit Mix had a sound business reason for taking Metz’s wages into account. Indeed, the holding of this case is that an employer may not replace an employee with one willing to accept lower pay, even though it has a sound business reason. Nothing in Griggs or any other case in the Supreme Court’s disparate impact sequence supports this.

If neither the text of the ADEA nor the disparate impact cases under Title VII support Metz, what about the ADEA’s legislative history? Little of the history is pertinent. None of the committee reports discusses the extent to which employers may take salary into account in making decisions. To the extent the legislative history addresses the subject, it suggests that employers may consider the costs of hiring older employees — and § 4(f)(2), 29 U.S.C. § 623(f)(2), writes into the statute the permission to use age as a ground of decision when costs so dictate. Section 4(f)(2) provides that employers may use age in, for example, designing insurance plans: term life insurance costs much more for 65-year-old employees than for 25-year-old employees, and § 4(f)(2) permits employers to consider that in designing packages of benefits. Senator Javits’s additional statement in the Senate Report, S.Rep. 90-723, 90th Cong., 1st Sess. 14 (1967), applauds § 4(f)(2) on the ground that without it “employers might actually have been discouraged from hiring older employees because of the increased costs involved in providing certain types of benefits to them.” See also, e.g., EEOC v. Borden’s, Inc., 724 F.2d 1390, 1395-96 (9th Cir.1984) (discussing the legislative history of § 4(f)(2)); Hearings Before the Subcommittee on Labor of the Senate Committee on Labor and Public Welfare, 90th Cong., 1st Sess. 30 (1967) (statement of Sen. Smathers, co-sponsor of the ADEA).

The assumption behind § 4(f)(2) is that without an explicit privilege to use age in the design of welfare and pension plans, the higher costs of fringe benefits for older persons would be a legitimate reason not to employ them. Secretary of Labor Wirtz, the Johnson Administration’s chief spokesman on the ADEA, made that point explicitly in both the Senate and the House hearings. See Senate Hearings at 49 (the higher costs of training an older employee for a job would be a “legitimate factor” for an employer to consider); Age Discrimination in Employment: Hearings Before the General Subcommittee on Labor of the House Committee on Education and Labor, 90th Cong., 1st Sess. 14 (1967) (an “unavoidable” differential effect on an employer’s payroll is a legitimate factor to be considered under the legislation).

The floor debate was inconclusive. Several members of Congress expressed concern that employers were taking the higher cost of older labor into account, but in the context of remarks that the employers did not appreciate that older workers still did good work. What were these members getting at?: that it is forbidden to look at an employee’s salary, or that it is forbidden to judge an employee by his age rather than by his ability to perform the work? The latter theme predominates.

The structure of the Act accords with its history. Section 4(a) parallels Title VII in some respects but is different in others. One striking difference is § 4(f)(1), which says that “reasonable factors other than age” may be the basis of decision — implying strongly that the employer may use a ground of decision that is not age, even if it varies with age. What else could be the purpose of this language? Surely it does not mean simply that “only age discrimination is age discrimination.” “The prohibition and the exception appear identical. The sentence is incomprehensible unless the prohibition forbids disparate treatment and the exception authorizes disparate impact.” Douglas Laycock, Continuing Violations, Disparate Impact in Compensation, and Other Title VII Issues, 49 L. & Contemp.Prob. 53, 55 (Aut.1986) (referring to the identical structure of the Equal Pay Act of 1963). In Washington County v. Gunther, 452 U.S. 161, 170-71, 101 S.Ct. 2242, 2248-49, 68 L.Ed.2d 751 (1981), the Court concluded that the “factor other than sex” language in the Equal Pay Act has independent significance. See also Los Angeles v. Manhart, 435 U.S. 702, 710-11 n. 20, 713 n. 24, 98 S.Ct. 1370, 1376-77 n. 20, 1377-78 n. 24, 55 L.Ed.2d 657 (1978), holding the “factor other than sex” exception to the Equal Pay Act precludes reliance on disparate impact analysis. Should not the parallel structure of the ADEA, enacted four years later, yield the same result?

There are other differences between Title VII and the ADEA. For example, § 4(f)(2) allows age to be used explicitly. Then there is § 4(f)(3), stating that an employer may discharge anyone for “cause”— another clause missing from Title VII. “Cause”, like “qualified”, is a continuous rather than dichotomous variable; not being productive enough to cover your wage is “cause”.

The language, structure, and history of the ADEA have led thoughtful people to conclude — with the district court in our case — that employers may consider wages in light of job performance, even that disparate impact analysis is inapplicable in ADEA cases. E.g., Albert Calille, Three Developing Issues of the Federal Age Discrimination in Employment Act of 1967, 54 Detroit J. Urban L. 431, 444 (1977); Donald R. Stacy, A Case against Extending the Adverse Impact Doctrine to ADEA, 10 Employee Relations L.J. 437 (1985); Note, The Cost Defense Under the Age Discrimination in Employment Act, 1982 Duke L.J. 580. Not to mention the views of Schlei and Grossman, the authors of the leading text on the law of employment discrimination, see Employment Discrimination Law 506 (2d ed. 1983), which directly support the district court’s decision. See also EEOC v. Wyoming, 460 U.S. 226, 233, 103 S.Ct. 1054, 1058, 75 L.Ed.2d 18 (1983) (describing § 4(f)(1) as permitting employers “to use neutral criteria not directly dependant [sic] on age”) (emphasis added); Markham v. Geller, 451 U.S. 945, 948, 101 S.Ct. 2028, 2030, 68 L.Ed.2d 332 (1981) (Rehnquist, J., dissenting from the denial of certiorari).

All of this does not deny the force of the position, expressed in Chrysler and similar cases, including Leftwich v. Harris-Stowe State College, 702 F.2d 686, 691 (8th Cir. 1983), that the ADEA forbids use of wage as a euphemism for age. In the words of EEOC v. Wyoming, when wage is “directly dependent on” age, the use of one is no better than the use of the other. Some colleges (and law firms) use lock-step compensation systems. The wage is a function of age and age only. For such an employer the statements “we are firing all professors with salaries above $35,000” and “we are firing all professors older than 65” are identical. Courts should treat them as identical. That is not remotely what Transit Mix did, however: it shut a poorly performing plant and fired its manager.

A growing literature on education, training, employment, and other aspects of human capital suggests that there may be times when employers will pay wages that do not represent the employees’ marginal products. For example, while receiving firm-specific training the employee may receive a wage exceeding his product; this is how the firm finances the training (for which the employee will not pay, because it has no use outside the firm). Later the firm will recoup its investment by paying less than the marginal product. See Becker, Human Capital 26-37, 216-23. Other firms that give their employees access to trade secrets or put them in positions of trust may try to cement the employees’ loyalty (or honesty) with “golden handcuffs” — wages in excess of the employees’ marginal product, a form of special compensation the employee forfeits if he leaves the firm. E.g., Gary S. Becker & George J. Stigler, Law Enforcement, Malfeasance, and Compensation of Enforcers, 3 J.Legal Studies 1 (1974). Still other firms may pay employees slightly less than their marginal product early in their careers, knowing that as each employee’s productivity declines at the end of his career, the firm will be paying more than marginal product (thus paying the employee his due over the life cycle). This gives employees strong reasons to stick with their firms and be more productive throughout their careers, which in turn yields society the benefit of everyone’s abilities. Edward P. Lazear, Agency, Earnings Profiles, Productivity and Hours Restrictions, 71 Am.Econ.Rev. 606 (1981); Robert Hutchens, Delayed Payment Contracts and a Firm’s Propensity to Hire Older Workers, 4 J.Labor Econ. 439 (1986). But cf. Peter Kuhn, Wages, Effort, and Incentive Compatibility in Life-Cycle Employment Contracts, 4 J.Labor Econ. 28 (1986).

Whenever the age-wage profile of a class of employees includes a period of compensation at more than marginal product, the firm may be inclined to behave opportunistically — to fire the employee as soon as his current productivity no longer covers his current wage. A firm’s desire to attract new employees will curtail this opportunism, to the extent new hires learn of the firm’s reputation (or depend on a union to police the firm’s behavior). When the firm encounters economic trouble or for some other reason plans to shrink, it need not worry about scaring away bright new employees; it is out of that market. The distressed or shrinking firm may try to dispose of higher paid, older employees, cheating them out of the high compensation at the end of their careers. A disparate impact approach under the ADEA might help to curtail this opportunism. Whether it would do so well enough in light of the substantial error costs the inquiry would entail, I need not consider, for this approach would not assist Metz even if it were the law. Metz does not contend that Transit Mix was changing the structure of its compensation so as to exploit its older employees. And my colleagues apparently would allow Transit Mix to do so, if it wanted — it could reduce Metz’s salary, they say.

The court’s invitation to employers to cut wages the next time they are in a situation like the one Transit Mix encountered not only fails to protect older employees against the principal danger they face but also creates an anomaly. If it would have been legitimate to reduce Metz’s wage, why can he collect damages in this case? Presumably Transit Mix would be entitled to reduce Metz’s wage to what he could command in the market from another employer. Metz in fact took another job two months after Transit Mix put Burzloff in charge of the Knox plant. His new job, with the Starke County Highway Commission, pays about $12,500 per year. Metz wants the difference between $26,000 (his salary at Transit Mix) and $12,500; but if Metz was worth only $12,500 in the market and Transit Mix could have cut his salary to that level, he should collect nothing.

We need not worry about damages, however. Metz had to show discriminatory intent and causation; he showed neither. He showed, at most, a decision by Transit Mix to disregard his age. That is the opposite of age discrimination. My colleagues avoid this only by merging disparate treatment and disparate impact analysis — which they ought not, because the premises of these approaches are different, because Feeney and Davis hold that they are different, because Manhart holds that “factor other than [age]” language forecloses disparate impact analysis, because this circuit already has held in Tice and Dorsch that employers may consider wages, and because the factual premises of the disparate impact approach do not obtain. My colleagues do not even cite these cases. If we are to buck both the language of the statute and the holdings of the Supreme Court, we ought to do so only when the facts are on our side. 
      
      . The most senior employee was Lawrence’s mother.
     
      
      . In response to questioning by the defendant’s attorney, Jere Humphrey, Lawrence testified at trial as follows:
      Q. Mr. Lawrence, you testified that in '83 that you had given Mr. Metz a raise of a thousand dollars, yet Mr. Metz testified that you had been losing money that year or, at least, sales were down. Why under those circumstances did you give Mr. Metz a $1,000.00 increase?
      A. Well, it was usual for me to give a raise every year. I gave the manager at the Plymouth plant, the general manager, a raise also. I gave Mr. Metz a raise also. I really can’t say that I pointed a finger of blame at Mr. Metz for business being as it was. But, that the area — the Knox area business was bad, it was very poor and it was getting worse and worse and worse. But, the previous year it was normal for me to give somebody — give both Jim and Wayne a raise each year, and I followed suit.
      Q. So, you have been doing it for a long time; you kept doing it?
      A. Yes, I have.
      Trial Transcript at 31 (June 23, 1986).
     
      
      . The statute protects employees who are at least forty years old. 29 U.S.C.A. § 631(a). Transit Mix qualifies as an “employer” within the meaning of the ADEA. 29 U.S.C. § 630(b).
     
      
      . The present claim is one of disparate treatment rather than disparate impact. The disparate impact mode of analysis, first applied in Title VII cases, permits a plaintiff to recover for "employment practices that are facially neutral in their treatment of different groups but that in fact fall more harshly on one group than another and cannot be justified by business necessity.” International Bhd. of Teamsters v. United States, 431 U.S. 324, 336 n. 15, 97 S.Ct. 1843, 1854-55 n. 15, 52 L.Ed.2d 396 (1977); see also Griggs v. Duke Power Co., 401 U.S. 424, 431, 91 S.Ct. 849, 853, 28 L.Ed.2d 158, 169 (1971) (finding as violative those employment practices that are "fair in form, but discriminatory in operation"). Unlike a disparate treatment claim, proof of motive is not required to sustain a claim of disparate impact. Teamsters, 431 U.S. at 335 n. 15, 97 S.Ct. at 1854-55 n. 15.
     
      
      . The district court found that the decision to terminate Metz was also motivated by the "greater flexibility afforded by Mr. Burzloff,” 646 F.Supp. at 289, who, in contrast to Metz, was able to return to the Plymouth plant provided the operations at the Knox satellite plant did not improve. The court characterized this consideration (as well as the salary/cost concern) as a "determining factor" in the firing decision. Id. at 290. The dissent argues that this finding is sufficient to support the court’s verdict independent of the issue presented by the defendant’s stated salary concerns. We disagree. The district court did not find that absent the desire to save the higher cost of Metz’s salary, Transit Mix nevertheless would have replaced Metz because of the flexibility motivation. The more reasonable interpretation is that the court found that both factors combined to provide a nondiscriminatory reason for the dismissal. That is, in the absence of salary concern, Metz would not have been replaced by Burzloff. For example, the court states that the salary issue "permeated the eventual decision” to replace Metz. Id. at 289. Indeed, this is the only interpretation which adequately explains the extended discussion of the salary issue by the district court.
     
      
      . As Willie Loman, of Arthur Miller’s Death of a Salesman, exclaimed to his boss upon being suddenly fired after thirty-four years of employment, "You can’t eat the orange and throw the peel away — a man is not a piece of fruit!” A. Miller, Death of a Salesman 82 (1949).
     
      
      . The treatise cites two district court cases for support, Donnelly v. Exxon Research & Eng’g Co., 12 Fair Empl.Prac.Cas. (BNA) 417 (D.N.J. 1974) [Available on WESTLAW, DCT database], aff'd mem., 521 F.2d 1398 (3d Cir.1975), and Mastie v. Great Lakes Steel Corp., 424 F.Supp. 1299 (E.D.Mich.1976).
      Although the court in Donnelly held that an employer may replace an older worker whose less-than-satisfactory services do not justify his salary, the court continued as follows:
      It would be unlawful and worse if an employer were to fire an older worker doing satisfactory work who, because of his seniority, received a certain salary because the employer wished to replace him with someone else who would do no better work but who, as a younger man with less seniority, would do the work for less.
      12 Fair Empl.Prac.Cas. at 421-22. Donnelly therefore does not support the distinction urged by Transit Mix; rather it supports a finding that Transit Mix violated the ADEA by replacing Metz, a satisfactory employee, with Burzloff, a younger man who would work for less.
      The court in Mastie did state that it "interprets the ADEA as permitting an employer to consider employment costs where such consideration is predicated upon an individual as opposed to a general assessment that the older worker’s cost of employment is greater than for other workers.” 424 F.Supp. at 1319. The court, however, acknowledged that this statement was dicta and "unnecessary in light of its other findings.” Id. One commentator has stated, “Mastie is both aberrational and, in any event, does not appear to be good law in light of the ruling in [EEOC v. Chrysler Corp., 733 F.2d 1183 (6th Cir.1984)].” 1 H. Eglit, Age Discrimination § 16.32, at 16-82.42 (1985). See discussion of Chrysler, infra p. 1208. For the reasons discussed in this opinion, we do not find the quoted statement in Mastie to be persuasive and therefore decline to follow it here.
     
      
      . In the context of reviewing claims requiring a showing of intentional discrimination, such as equal protection claims, courts have closely scrutinized the use of seemingly neutral criteria to justify practices which have a discriminatory effect. For example, a three-judge court in Rybicki v. State Bd. of Elections, 574 F.Supp. 1082, 1109-11 (N.D.Ill.1982), found that an Illinois redistricting plan violated the equal protection clause of the Fourteenth Amendment because it diluted minority voting strength. The court found the requisite intentional discrimination despite the offered "neutral” justification that the plan served to protect the ability of incumbents to be elected. The court stated: "[T]he requirements of incumbency are so closely intertwined with the need for racial dilution that an intent to maintain a safe, primarily white, district ... is virtually coterminous with a purpose to practice racial discrimination." Id. at 1109; see also Ketchum v. Byrne, 740 F.2d 1398, 1406-10 (7th Cir.1984).
     
      
      . Our dissenting colleague is perhaps not acquainted with the old Army saying that, "There are two methods of promotion: seniority and favoritism.”
     
      
      . We assume (and we do not understand the dissent to disagree) that what aging middle managers would receive from their longtime employers is not necessarily what they could expect to command on the street.
     
      
      
        Dale and Sherkow did not arise under the ADEA, but like the majority I assume that the principles established by other antidiscrimination statutes apply to cases under the ADEA. On the role of causation, see also, e.g., Lewis v. University of Pittsburgh, 725 F.2d 910, 915-17 (3d Cir.1983), and Toney v. Block, 705 F.2d 1364 (D.C.Cir.1983) (Scalia, J.). There is a potentially difficult problem in allocating the risk of non-persuasion about causation. Compare Toney with Hopkins v. Price Waterhouse, 825 F.2d 458, 469-72 (D.C.Cir.1987), with id. at 464 (Williams, J., dissenting). But the parties have not briefed this question, and so I do not consider it further.
     