
    Daniel G. DEMER, Plaintiff-Appellant, v. IBM CORPORATION LTD PLAN; Metropolitan Life Insurance Company, Defendants-Appellees.
    No. 13-17196
    United States Court of Appeals, Ninth Circuit.
    Argued and Submitted December 11, 2015, San Francisco, California
    Filed August 26, 2016
    
      Michelle L. Roberts (argued), Roberts Bartolic LLP, Alameda, California; Barry Kirschner, Waterfall, Economidis, Caldwell, Hanshaw & Villamana, P.C., Tucson, Arizona; for Plaintiff-Appellant.
    Michelle McAloon Constandse (argued), Metropolitan Life Insurance Company, Irvine, California; James K. Mackie, Ogletree, Deakins, Nash, Smoak & Stewart, P.C., Tucson, Arizona, for Defendants-Appellees.
    Before: JAY S. BYBEE and MORGAN CHRISTEN, Circuit Judges, and EDWARD M. CHEN, District Judge.
    
    
      
       The Honorable Edward M. Chen, United States District Judge for the Northern District of California, sitting by designation.
    
   Partial Concurrence and Partial Dissent by Judge BYBEE

OPINION

CHEN, District Judge:

Plaintiff-Appellant Daniel G. Demer filed suit, pursuant to the Employee Retirement Income Security Act of 1974 (“ERISA”), against Defendants-Appellees IBM Corporation LTD Plan (the “Plan”) and Metropolitan Life Insurance Company (“MetLife”). Mr. Demer claimed that MetLife, the claim administrator and insurer for the Plan, improperly denied his claim for long-term disability (“LTD”) benefits. See 29 U.S.C. § 1132(a)(1)(B) (providing that “[a] civil action may be brought ... by a participant or beneficiary ... to recover benefits due to him under the terms of his plan, to enforce his rights under the terms of the plan, or to clarify his rights to future benefits under the terms of the plan”). The district court denied Mr. Demer’s motion for summary judgment, granted Defendants’ cross-motion, and entered judgment in favor of Defendants.

We reverse the district court’s entry of judgment in Defendants’ favor and remand to the district court with instructions to remand this case to MetLife to re-evaluate the merits of Mr. Demer’s LTD claim.

I.

A. Mr. Demer’s Claim for LTD Benefits

Mr. Demer was an employee of IBM Corporation and a participant in the Plan. MetLife is the claim administrator for and insurer of the Plan. The parties agree that the Plan gives MetLife, as the administrator, discretionary authority to interpret the Plan and determine benefits eligibility. Where, as here, an ERISA plan confers discretionary authority on the plan administrator as a matter of contractual agreement, then the standard of review is abuse of discretion rather than de novo. See Tapley v. Locals 302 & 612 of the Int’l Union of Operating Eng’rs-Employers Constr. Indus. Ret. Plan, 728 F.3d 1134, 1139 (9th Cir. 2013) (“Where an ERISA Plan grants ‘discretionary authority to determine eligibility for benefits or to construe the terms of the plan,’ ‘a plan administrator’s interpretation of a plan’ is reviewed for abuse of discretion. We review the district court’s application of this standard de novo.”); Abatie v. Alta Health & Life Ins. Co., 458 F.3d 955, 963 (9th Cir. 2006) (en banc) (“[I]f the plan does confer discretionary authority as a matter of contractual agreement, then the standard of review shifts [from de novo] to abuse of discretion.”) (emphasis omitted).

The Plan provides that a participant is disabled and eligible for LTD benefits if,

during the elimination period and the first 12 months after you complete the elimination period, you cannot perform the important duties of your regular job [i.e., your own occupation] with IBM because of a sickness or injury. After expiration of that 12 month period, disabled means that, because of a sickness or injury, you cannot perform the important duties of any other gainful occupation for which you are reasonably fit by your education, training or experience.

“[G]ainful occupation” means “occupations [for which] you are reasonably qualified based on your education, training, experience, and functional ability” and further, in Mr. Demer’s case, “provides gainful wages of $4,240.48 per month or $24.46 hourly,” i.e., the equivalent of a yearly salary of approximately $50,000.

Mr. Demer stopped working at IBM on January 9, 2009, because of a disability. At the time, he was a Lead Internal Auditor at IBM. He began receiving short term disability (“STD”) benefits. In March 2009, he filed a claim for LTD benefits pursuant to the Plan (because his STD benefits were due to expire soon). In his application for LTD benefits, Mr. Demer stated: “I am unable to do my job duties due to severe recurrent depression and spinal stenosis, chronic headaches.” Symptoms included “chronic headaches, chronic back and neck pain, myalgia, severe depression, [and] sciatica.”

On July 28, 2009, MetLife approved Mr. Demer’s claim for LTD benefits under the “own occupation” test for disability articulated in the Plan. MetLife noted that the test for disability would eventually switch to the “any occupation” test on July 11, 2010. MetLife further noted that it was limiting Mr. Demer’s benefits to a period of twenty-four months because his primary diagnosis was a mental or nervous disorder.

Subsequently, in November 2009, MetLife sent a letter to Mr. Demer, reminding him that, for his benefits to continue (beyond July 11, 2010), he would have to be disabled under the “any occupation” test for disability.

Mr. Demer thereafter submitted statements and medical records from numerous treating physicians, including but not limited to his primary care doctor, Dr. Stephen Moore; a treating neurologist, Dr. David Weidman; and a treating pain management physician, Dr. Robert Osborne. These doctors discussed not only mental impairments suffered by Mr. Demer but also physical impairments. For example:

• In a statement from February 2010, Dr. Weidman referred to “chronic osteoarthritic pain and depression interact[ing] with each other.” Dr. Weidman also indicated that Mr. Demer’s physical condition had deteriorated since April 2009 — e.g., in April 2009, Dr. Weidman had concluded that Mr. Demer could intermittently sit for 4-5 hours, intermittently stand for 4-5 hours, and occasionally lift 11-20 pounds; but, in February 2010, Dr. Weidman determined that Mr. Demer could only intermittently stand for 1-2 hours and never lift 11-20 pounds.
• In a medical record dated February 2010, Dr. Osborne referred to severe cervical and lumbosacral spine disease with radiculopathy and chronic depression. Notably, Dr. Osborne found that Mr. Demer had severe limitations as a result of his physical impairments — e.g., able to intermittently sit for 1 hour, intermittently stand for 0-1 hour, and intermittently walk for 0-1 hour and never able to lift up to 10 pounds.
• Dr. Moore, Mr. Demer’s primary care physician, had a similar, albeit slightly more positive, view with respect to Mr. Demer’s physical limitations, opining, e.g., that Mr. Demer could continuously sit for 1 hour and continuously stand and walk for 0-1 hour and could occasionally lift 21-50 pounds.

On October 1, 2010, MetLife denied Mr. Demer’s claim for LTD benefits under the “any occupation” test for disability. In its denial, MetLife relied in large part on the opinion of an independent physician consultant (“IPC”), Dr. Elyssa Del Valle, internal medicine, who conducted only a paper review of Mr. Demer’s file — i.e., she did not personally perform a physical or mental examination of Mr. Demer. Dr. Del Valle concluded that “[t]he medical information does support functional limitations ... due to severe degenerative disc disease, degenerative vertebral disease with numerous levels of the cervical, thoracic and lumbar spine associated with neural foraminal narrowing as well as spinal stenosis.” She also stated that “[t]he condition is associated with chronic pain necessitating narcotic analgesics despite trigger point injections, cervical and lumbar epidural injections and physical therapy.” But Dr. Del Valle disagreed with the physical capacity assessments of Dr. Moore and Dr. Osborne because they “would indicate that [Mr. Demer] is bedridden for more than 20 hours a day.” Dr. Del Valle also indicated that she agreed with an older assessment made by Dr. Weidman (from April 2009), noting that, although it was more than a year old, “there are no clinical data/findings to indicate any change in his overall condition” (opining, inter alia, that Mr. Demer could walk 3-4 hours intermittently and that he “should avoid any prolonged periods of sitting, standing or walking more than 30 minutes”). In its decision, MetLife determined that, even with the limitations identified by Dr. Del Valle, Mr. Demer “should be able to perform at the sedentary to light level of physical exertion as defined by the U.S. Department of Labor” and therefore denied Mr. Demer LTD benefits.

B. The Appeal

In March 2011, Mr. Demer appealed MetLife’s denial of LTD benefits. In his appeal, Mr. Demer asserted that he “has severe degenerative disc disease (‘DDD’) of the cervical and lumbar spine,” for which there was “further progression [as] reflected in the cervical MRI performed June 21, 2010.” He also claimed that he “suffers radiculopathy,” “has a history of significant headaches,” and has “ongoing nerve compression.” Finally, he pointed out that he “takes powerful narcotic and other medications” which “have known side effects causing fatigue and reduced ability to concentrate.” As noted above, MetLife’s own IPC, Dr. Del Valle, acknowledged that Mr. Demer had chronic pain that necessitated narcotic analgesics.

In support of his appeal, Mr. Demer provided, e.g., additional information from Dr. Osborne. Dr. Osborne stated, inter alia, that

the overall picture is one of a gentleman with severe spinal deterioration at all components of the spine as well as neurophysiological evidence of a delayed conduction (spine cord problem) of the bilateral Posterior Tibial Nerves to the cerebral cortex as well as a separate focal left L5 nerve root lesion (diagnostic SSEP and diagnostic L5 radiculopathy).

Dr. Osborne further stated that “[t]he overall treatment plan has included chronic narcotic medication in attempt to control his overall pain” which has side effects that “limit the ability to complete productive mental functions.” Mr. Demer also provided third-party witness statements from his brother (Frank Demer) and a friend (Shirley Piel) and a personal statement in support of his appeal. Both Ms. Piel’s statement and Mr. Demer’s personal statement addressed, inter alia, the impact Mr. Demer’s medications had on his mental ability to function.

MetLife denied the appeal, this time relying on the opinions of two different IPCs, namely, Dr. Marcus Goldman, Board Certified in psychiatry, and Dr. Dennis S. Gordan, Board Certified in physical medicine and rehabilitation. Like Dr. Del Valle, Dr. Goldman and Dr. Gordan conducted only paper reviews of Mr. Demer’s file without any personal examination.

With regard to mental functional limitations, Dr. Goldman stated that, “[g]iven the lack of recent data and the paucity of any compelling objective findings, as well as the lack of serial mental status examinations, this reviewer would be unable to establish the presence of an impairing mental condition.”

With regard to physical functional limitations, Dr. Gordan acknowledged that there was “documented anatomical cervical spinal stenosis, degenerative disc disease, and degenerative facet disease of the spine, as well as degenerative arthritis of the left hip.” He disagreed, however, that Mr. Demer suffered from a radiculopathy based on his interpretation of the medical evidence. Dr. Gordan also indicated that Dr. Osborne’s impressions may have been colored by Mr. Demer’s “dire” account of his history, “a reversal of his prior positive attitude ... about the effectiveness of the previous interventional procedures and medications.” In addition, Dr. Gordan relayed a conversation he had with Dr. Moore (Mr. Demer’s primary care physician) in which Dr. Moore said “he thought it was likely that [Mr. Demer] could do a very sedentary job, but ... felt that he would have to see him again to say that definitively.” Dr. Gordan ultimately concluded that Mr. Demer “likely had a modicum of discomfort” from, inter alia, “neck and back pain related to spinal degeneration, and referred pain down the limbs from those degenerative changes,” but Mr. Demer retained the physical functional capacity to, e.g., “sit[ ] for an hour at a time ... and up to 7 hours a day, stand[ ] and walk[ ] for 15 minutes at a time and up to 2 hours a day, lift[ ] up to 10 pounds frequently, 20 pounds occasionally.”

In addition to the above, both Dr. Goldman and Dr. Gordan addressed the specific issue raised by Mr. Demer in his appeal that the medications prescribed for his physical condition affected his ability to mentally function. According to Dr. Goldman, “there clearly are no objective or other compelling or convincing data to establish functional impairment as a result of Mr. Demer’s psychotropic medications.” Dr. Gordan stated: “There is no specific information about medications taken or effects from them during the period in question. Although Dr. Osborne asserted that the claimant’s needed narcotic medication caused cognitive side effects, there was never any evidence of that.”

In denying Mr. Demer’s appeal, MetLife appears to have accepted Dr. Gordan’s physical capacity assessment. MetLife also appears not to have placed any mental limitations on Mr. Demer as a result of his medications. Based on the physical capacity assessment and lack of any cognitive limitation, and an occupation assessment conducted by a vocational rehabilitation consultant based thereon, MetLife concluded that Mr. Demer could work in certain sedentary occupations, such as Project Director and Computer Security Coordinator.

C. District Court Proceedings

Following MetLife’s denial of LTD benefits, Mr. Demer initiated this lawsuit. In reviewing MetLife’s denial of benefits, the district court applied the abuse-of-discretion standard and rejected Mr. Demer’s contention that the abuse-of-discretion review must be tempered with skepticism because of a conflict of interest on the part of MetLife. See Demer v. IBM Corp., 975 F.Supp.2d 1059, 1076-77 (D. Ariz. 2013). The district court found that

the record taken as a whole establishes that MetLife reasonably relied on its IPCs’ reports. Every doctor agreed that Plaintiff suffered from a combination of depression and chronic pain syndrome, but every doctor also had a different opinion as to Plaintiff’s future functionality. MetLife was required to choose between divergent opinions. MetLife’s decision to rely on its IPCs’ findings was reasonable.

Id. at 1083.

II.

We first address whether MetLife had a conflict of interest such that our review should be tempered by skepticism. See Harlick v. Blue Shield of California, 686 F.3d 699, 707 (9th Cir. 2012). A conflict of interest is a factor in the abuse-of-discretion review, the weight of which depends on the severity of the conflict. See id.; see also Renfro v. Funky Door Long Term Disability Plan, 686 F.3d 1044, 1048 (9th Cir. 2012) (noting that, “if the plan gives discretion, but the administrator operates under a conflict of interest, then ‘the conflict of interest must be weighed as a factor in determining whether there is an abuse of discretion’”) (quoting Met. Life Ins. Co. v. Glenn, 554 U.S. 105, 110-11, 128 S.Ct. 2343 (2008)); Stephan v. Unum Life Ins. Co. of America, 697 F.3d 917, 929 (9th Cir. 2012) (noting that degree of skepticism in determining whether administrator abused its discretion varies based on extent of conflict of interest); Montour v. Hartford Life & Acc. Ins. Co., 588 F.3d 623, 630-31 (9th Cir. 2009) (stating that the extent of a conflict of interest affects its weight in the overall analysis of whether an abuse of discretion occurred).

In the instant case, the evidence of a conflict of interest on which Mr. Demer relies consists of the following: (1) MetLife is both the claim administrator for the Plan and its insurer and (2) at least two of the IPCs that MetLife hired to review the medical record (Dr. Del Valle and Dr. Gordan) have performed a significant number of reviews for MetLife and have received significant compensation for their services.

“‘We review de novo a district court’s choice and application of the standard of review to decisions by fiduciaries in ERISA cases.’” Prichard v. Metro. Life Ins. Co., 783 F.3d 1166, 1168 (9th Cir. 2015).

A. Structural Conflict of Interest

In its opinion, the district court acknowledged that MetLife has a structural conflict of interest because MetLife both evaluates claims made against the Plan and funds claims. See Montour, 588 F.3d at 630 (noting that, when “the same entity that funds an ERISA benefits plan also evaluates claims, ... the plan administrator faces a structural conflict of interest: since it is also the insurer, benefits are paid out of the administrator’s own pocket, so by denying benefits, the administrator retains money for itself”). However, the district court applied no skepticism as a result of the structural conflict because “MetLife has taken affirmative steps to reduce potential bias and promote accurate claim determinations.” Demer, 975 F.Supp.2d at 1076; see also MetLife, 554 U.S. at 117, 128 S.Ct. 2343 (noting that a conflict of interest “should prove less important (perhaps to the vanishing point) where the administrator has taken active steps to reduce potential bias and to promote accuracy, for example, by walling off claims administrators from those interested in firm finances, or by imposing management checks that penalize inaccurate decisionmaking irrespective of whom the inaccuracy benefits”).

Mr. Demer objects to the district court’s reliance on the declarations from two employees, Gregory Hafner and Laura Sullivan, who describe the affirmative steps taken by MetLife to reduce its structural conflict, on the ground that neither Mr. Hafner nor Ms. Sullivan was disclosed as a witness in MetLife’s initial disclosures as required by Federal Rule of Civil Procedure 26. MetLife did not explain its failure to identify witnesses in its mandatory initial disclosures; on the other hand, Mr. Demer did not explain his failure to take a 30(b)(6) deposition on the structural conflict issue. See James v. AT&T West Disability Benefits Program, 41 F.Supp.3d 849, 871 (N.D. Cal. 2014) (finding defendant’s failure to disclose conflict-of-interest declarations harmless because “plaintiff had ample time to seek discovery, but did not do so — [thus] she cannot credibly claim prejudice.”).

We need not resolve this issue because, even assuming there is no residual structural conflict (i.e., because of affirmative steps taken by MetLife to insulate its claims department), some skepticism is warranted here because of the financial conflict of the IPCs upon whom MetLife relied.

B. Financial Conflict of Independent Physician Consultants

Mr. Demer claims MetLife operated under a conflict of interest because two of the IPCs that MetLife hired to review the medical record, Dr. Del Valle and Dr. Gordan, have done a substantial number of reviews for MetLife and received significant compensation from MetLife for their services. For 2009 and 2010, Dr. Del Valle performed more than 250 reviews/addendums each year and earned more than $125,000 each year; for the same time period, Dr. Gordan performed between 200-300 reviews/addendums each year and earned more than $175,000 each year. Based on the number of reviews and the amount of compensation, Mr. Demer asserts that the opinions of Dr. Del Valle and Dr. Gordan should be questioned because the doctors had financial incentives to render opinions favorable to MetLife. Mr. Demer further argues that, because MetLife relied on the doctors’ opinions in denying him relief, the doctors’ conflict is, in effect, imparted to MetLife.

As a preliminary matter, we note that Mr. Demer’s argument here is comparable to conventional approaches to discrediting the testimony of retained experts whose objectivity may be challenged based on, e.g., the number of times he or she has served as an expert in support of a party and the amount of compensation received. This alleged conflict of interest is distinct from the purported structural conflict of interest discussed above. The lack of any structural conflict of interest on the part of MetLife does not preclude MetLife from having a conflict of interest based on an IPC’s financial interests; the factors that raise the possibility of a structural conflict relate to the incentives applicable to MetLife’s claims department, whereas the factors that raise the possibility of a financial conflict relate to the incentives applicable to MetLife’s retained experts. Even if MetLife operated with no structural conflict, reliance on the reports of its retained experts who have a financial incentive to make findings favorable to MetLife may warrant skepticism.

We further take note that it is Mr. Demer’s burden, as the party claiming a conflict, to produce evidence of a financial conflict sufficient to warrant a degree of skepticism. Placing the burden on Mr. Demer, as an initial matter, makes sense given that he is asking for a departure from the otherwise applicable standard of review for abuse of discretion. Once such evidence is produced, however, the burden then shifts to MetLife to produce evidence that there is no conflict. Cf. Muniz v. Amec Constr. Mgmt., 623 F.3d 1290, 1295 (9th Cir. 2010) (in discussing a structural conflict of interest, stating, “when a claimant produces evidence that a plan administrator’s self-interest caused a breach of the administrator’s fiduciary obligations to the claimant, a rebuttable presumption arises in favor of the claimant and the plan bears the burden of proving that a conflict of interest did not affect its decision to deny or terminate benefits”); see also Estate of Barton v. ADT Sec. Servs. Pension Plan, 820 F.3d 1060, 1065-66 (9th Cir. 2016) (indicating that a plaintiff fairly bears the burden of proving entitlement to ERISA benefits where he or she has better or at least equal access to the evidence needed to prove entitlement; in certain cases, however, the defending entity solely controls the information that determines entitlement).

We conclude that Mr. Demer has satisfied his burden of production. Mr. Demer has offered evidence that the IPCs have earned a substantial amount of money from MetLife ($125,000-$175,000 each year) and have performed a substantial number of reviews for the company as well (200-300 reviews/addendums each year). The magnitudes of these numbers, particularly when combined, raise a fair inference that there is a financial conflict which influenced the IPCs’ assessments, and thus such conflict should be considered as a factor in reviewing MetLife’s decision for abuse of discretion. See Montour, 588 F.3d at 634 (“how frequently [the insurance company] contracts with the file reviewers it employed in this case” is relevant to ascertaining conflict); Nolan v. Heald College, 551 F.3d 1148, 1152 & n.3 (9th Cir. 2009) (evidence that the outside medical reviewers “received substantial work and monies from MetLife in the three-to-four years preceding and including [the claimant’s] benefits denial” could be a factor tempering abuse of discretion review). Here, the evidence of the IPCs’ financial conflict of interest, in the absence of contrary evidence from MetLife, warrants some skepticism in reviewing MetLife’s decision.

To be sure, the lack of more powerful evidence that, e.g., the IPCs had “ ‘some specific stake in the outcome’ ” of Mr. Demer’s case, McDonald v. Hartford Life Group Ins. Co., 361 Fed.Appx. 599, 610 (5th Cir. 2010), or of statistics showing a parsimonious pattern of assessments disfavorable to claimants, see Montour, 588 F.3d at 634, minimizes the “weight [assigned] to the conflict of interest as a factor in the overall analysis of whether an abuse of discretion occurred.” Id. at 631. But that lack of such specific evidence does not mean that there is no conflict of interest. Here, we have evidence of not only the frequency of reviews for MetLife but also the significant dollar amounts earned by the reviewers.

Furthermore, that Mr. Demer could have, but did not, develop a stronger record of the IPCs’ conflict of interest does not mean that there is no conflict. Because Mr. Demer did provide evidence of a financial conflict warranting an inference of bias, the burden shifted to MetLife to counter that evidence. As we noted in Montour, both the plaintiff and the administrator ran a risk in not developing evidence of bias or lack thereof. See id. at 634 (before addressing plaintiff’s failure to submit extrinsic evidence of bias such as statistics of rate of claims denied or frequency of file reviews, court took note of administrator’s “failure to present extrinsic evidence of any effort on its part to ‘assure accurate claims assessment’ ”). Here, MetLife could have maintained records of its reviewers’ findings on claims to show their neutrality in practice, but it did not. While MetLife therefore missed an opportunity to negate any inference of a financial conflict of interest, Mr. Demer failed as well to develop more powerful evidence that could have established enhanced skepticism in reviewing MetLife’s decision. Thus, we find there is neither a lack of conflict of interest (justifying no skepticism) nor a substantial conflict of interest (warranting enhanced skepticism). Instead, the financial conflict — modest but extant — warrants some, but not substantial, weight under Abatie and Montour.

The dissent argues that MetLife “listened very carefully” to our instruction in Abatie that plan administrators may reduce conflicts by “referr[ing] medical evaluations to outside experts, such as doctors, who also have no interest in firm finances,” and that for its trouble we give MetLife additional scrutiny. But the dissent fails to acknowledge that Abatie considered an administrator’s use of “truly independent medical examiners or a neutral, independent review process.” Abatie, 458 F.3d at 969 & n.7 (emphasis added). The dissent mistakenly equates outside experts with independent experts, but the former does not guarantee the latter. We do not quarrel with the notion that using outside medical evaluators can be an important step toward the goal of obtaining neutral assessments, but it is not hard to imagine an outside medical examiner who does not engage in a neutral, independent review, such as where the examiner receives hundreds of thousands of dollars from a single source and performs hundreds of reviews for that source every year.

Despite the dissent’s suggestion that the majority disapproves of outside reviewers, we imply no such disapproval; we simply apply the unremarkable proposition that the number of examinations referred and the size of the professional fees paid to a reviewer may compromise the neutrality of an expert. See Montour, 588 F.3d at 634; Nolan, 551 F.3d at 1152, n.3. The extra-circuit decisions the dissent cites do not stand for the proposition that outside experts are immune from judicial scrutiny for possible bias. While the formulation in determining whether a financial conflict of interest exists may be stated in various ways, we think it is clear under the facts in this case where MetLife paid substantial monies for a high volume of repeat work to the IPCs involved and there is no evidence rebutting an inference of bias, there is sufficient evidence of a financial conflict to temper abuse of discretion review.

C. Evidence of Mental Limitations

Having concluded that the abuse-of-discretion review should be tempered with some skepticism, we now turn to Mr. Demer’s contention that MetLife abused its discretion in denying his claim for benefits because it did not find his mental functional capacity was affected in any way by the medications he was taking for his physical pain. As indicated above, MetLife did not ask its vocational rehabilitation consultant to consider any limitation on Mr. Demer’s mental ability to function. We conclude that MetLife abused its discretion in denying Mr. Demer’s claim.

In reaching this conclusion, we first take note of three points that are essentially undisputed:

(1) Mr. Demer “takes powerful narcotic and other medications, prescribed in attempts to manage his pain.” These medications included morphine.

(2) These medications were medically necessary to address Mr. Demer’s pain arising from physical impairments. (MetLife’s IPC Dr. Del Valle noting that Mr. Demer’s physical problems “necessitat[e] narcotic analgesics”).

(3) The “prescribed narcotic and neurological oriented medications have known side effects” on an individual’s mental functioning. (Mr. Demer’s treating physician Dr. Osborne stating that “[t]he side effects with dosing of narcotics, limit the ability to complete productive mental functions[;] [t]hey are to be expected and are limits of the only treatment available for this gentleman”).

Moreover, in a personal statement, Mr. Demer claimed that he did, in fact, suffer side effects as a result of his medications, including fatigue and difficulty with concentration (e.g., the medications “cause me to fatigue and, and they help confuse me in my thinking and ability to communicate”; “I can no longer read complex materials because I cannot concentrate to comprehend them”; and “I also have memory lapses after having read the pages I may still be looking at”).

Mr. Demer corroborated his claim with a statement from a friend, Ms. Piel (e.g., she “know[s] [Mr.] Demer”; “[t]here has been a sharp decline in his well being during the past ten years”; she has viewed his physical pain; Mr. Demer has side effects from the prescribed medication which makes him “consistently appear[ ] to be in a haze, unable to cope with what were once routine matters”; and Mr. Demer “has repeatedly demonstrated his inability to safely drive because of the inability to focus”).

He also pointed to supporting contemporaneous evidence from his treating physicians. For example, Dr. Osborne expressed agreement that his physical examinations and medical records indicated that Mr. Demer was suffering from side effects of his medications, “which infringe on [his] ability to concentrate and tend to diminish [his] energy.” Dr. Moore commented that Mr. Demer “has cognitive limitations [secondary to] pain as well as analgesics.”

Despite this evidence, MetLife rejected any mental limitations based on the opinions of two IPCs, Drs. Goldman and Dr. Gordan, neither of whom actually examined Mr. Demer. (Dr. Goldman stated that, “[b]eyond October 29, 2010, there clearly are no objective or other compelling or convincing data to establish functional impairment as a result of Mr. Demer’s psychotropic medications.” Dr. Gordan stating that “[t]here is no specific information about medications taken or effects from them during the period in question!;] [although Dr. Osborne asserted that the claimant’s needed narcotic medication caused cognitive side effects, there was never any evidence of that.”) Implicit in each doctor’s opinion — and therefore Met-Life’s decision — was a conclusion that Mr. Demer’s complaints of fatigue and difficulty concentrating were not credible.

But the IPCs had little basis for rejecting Mr. Demer’s credibility. In addition to the fact that the IPCs never examined Mr. Demer, they never explained specifically why they rejected Mr. Demer’s claim of mental function limitations when (1) he was taking what are undisputedly powerful narcotic medications and (2) his subjective complaints were corroborated by his treating physicians as well as a friend (Ms. Piel). See Godmar v. Hewlett-Packard Co., 631 Fed.Appx. 397, 406 (6th Cir. 2015) (stating that “there is ‘nothing inherently objectionable about a [paper] review,’ ” but such “reviews are particularly troubling when the administrator’s consulting physicians — who have never met the claimant— discount the claimant’s limitations as subjective or exaggerated”; adding that “ ‘we will not credit a file review to the extent that it relies on adverse credibility findings when the files do not state that there is reason to doubt the applicant’s credibility’ ”); Montour, 588 F.3d at 634-35 (indicating that a plan should not require a claimant to provide objective proof of his pain level and that a plan should not reject subjective claims of excess pain based solely on a paper review’s observation that a physical impairment should not cause the claimant as much pain as he was reportedly suffering); cf. Rollins v. Massanuri, 261 F.3d 853, 857 (9th Cir. 2001) (in the Social Security context, noting that “subjective pain testimony cannot be rejected on the sole ground that it is not fully corroborated by objective medical evidence”).

We acknowledge that the district court’s order suggests possible grounds for questioning Mr. Demer’s credibility — i.e., that his activities of daily living indicated some ability to engage in mental functioning. See Demer, 975 F.Supp.2d at 1081 (stating that “Dr. Osborne’s opinion that Plaintiff could not operate a vehicle was directly contradicted by Plaintiff’s conversations with MetLife on January 14, 2010 and May 18, 2010, where he stated that he had been driving a vehicle[;] [f]urther, while receiving disability payments, Demer told a MetLife claims representative that ‘he was just completing online courses’ ”); (MetLife’s electronic diary notes). But neither MetLife nor its IPCs rejected Mr. Demer’s credibility on this basis. See Harlick, 686 F.3d at 719-20 (stating that “[t]he general rule ... is that a court will not allow an ERISA plan administrator to assert a reason for denial of benefits that it had not given during the administrative process”). Moreover, it is not clear that these activities of daily living necessarily establish an ability to work within the meaning of the Plan. Notably, under the terms of the Plan, Mr. Demer is eligible for LTD benefits if he cannot engage in a “gainful occupation,” which in Mr. Demer’s case is a job that has a yearly salary of approximately $50,000. A job that commands such a salary may well require higher levels of mental functioning, including concentration and memory, both of which are areas where Mr. Demer has claimed impairment as a result of his medications.

D. Evidence of Physical Limitations

There is an additional factor weighing in favor of finding an abuse of discretion by MetLife. In denying Mr. Demer’s appeal, MetLife effectively adopted the physical functional capacity assessed by Dr. Gordan — i.e.,

that Mr. Demer would be capable of sitting for an hour at a time, with short breaks for stretching, up to seven hours a day; standing and walking for 15 minutes at a time and up to two hours a day; lifting up to 10 pounds frequently, 20 pounds occasionally and 35 pounds rarely; occasionally twisting, bending, stooping, and reaching above shoulder level, driving, and doing repetitive movements with either hand and occasionally climbing stairs.

Similar to above, Dr. Gordan was implicitly rejecting Mr. Demer’s credibility based solely on a paper review without having physically examined him and without explaining why Mr. Demer’s credibility was lacking, particularly in light of some medical records conflicting with Dr. Gordan’s physical functional capacity assessment. Most notably, Dr. Gordan’s assessment conflicted with the more restrictive assessment adopted by MetLife’s other IPC, Dr. Del Valle, which MetLife had previously adopted in initially denying Mr. Demer benefits. For instance, with respect to lifting capacity, Dr. Gordan found that Mr. Demer could lift up to 10 pounds frequently, but previously, MetLife found (as part of its initial denial) that Mr. Demer could not frequently lift more than 10 pounds. Also, whereas Dr. Gordan found that Mr. Demer could sit with breaks up to seven hours a day, MetLife previously found (based on Dr. Del Valle’s initial assessment) that he could only sit “4-6 hours per 8 hour work day with proper ergonomics and the ability to change position as needed.” MetLife never explained why it concluded that Dr. Gordan’s assessment was more appropriate than Dr. Del Valle’s earlier assessment, particularly since the record indicated that Mr. Demer’s condition did not improve (and may have deteriorated) over time.

E. Conclusion

Taking into account the totality of the circumstances — i.e., the financial conflict of interest of the IPCs on whom MetLife relied (which warrants some skepticism in reviewing the IPCs’ conclusions), the substantial evidence of Mr. Demer’s mental limitations due to pain medication and physical limitations, and the IPCs’ reviews of Mr. Demer’s condition, without having examined him and without explaining why they rejected his credibility, particularly in light of evidence corroborating his credibility (both medical and nonmedical) — MetLife abused its discretion in denying Mr. Demer’s claim for LTD benefits.

F. Remedy

The question remaining is what remedy should issue. See Cook v. Liberty Life Assurance Co. of Boston, 320 F.3d 11, 24 (1st Cir. 2003) (“Once a court finds that an administrator has acted arbitrarily and capriciously in denying a claim for benefits, the court can either remand the case to the administrator for a renewed evaluation of the claimant’s case, or it can award a retroactive reinstatement of benefits.”). We hold that a remand to the district court, with instructions to remand to MetLife, is appropriate. An award of benefits is not a proper remedy because the record does not clearly establish that MetLife should necessarily have awarded Mr. Demer benefits. Cf. Grosz-Salomon v. Paul Revere Life Ins. Co., 237 F.3d 1154, 1163 (9th Cir. 2001) (“[R]etroactive reinstatement of benefits is appropriate in ERISA cases where ... ‘but for [the insurer’s] arbitrary and capricious conduct, [the insured] would have continued to receive the benefits’ or where ‘there [was] no evidence in the record to support a termination or denial of benefits.’ ”).

To be clear, on remand, MetLife may reopen the record to consider additional evidence regarding mental limitations. The record as it stands does not show precisely what Mr. Demer’s limitations were as a result of the medications. While a retrospective evaluation may be difficult given the passage of time, a retrospective evaluation of Mr. Demer’s limitations is not necessarily impossible. Indeed, in the Social Security context, retrospective evaluations are not uncommon. Historical records, data and trends may be relevant and useful in rendering a retrospective evaluation. See, e.g., Smith v. Bowen, 849 F.2d 1222, 1225 (9th Cir. 1988) (in Social Security case, stating that “reports containing observations made after the period for disability are relevant to assess the claimant’s disability[;] [i]t is obvious that medical reports are inevitably rendered retrospectively and should not be disregarded solely on that basis”). Furthermore, a current evaluation of Mr. Demer may be particularly useful because his benefit period may have extended beyond the date of the appeal, see 2ER 130, 217 (addressing Maximum Benefit Period), such that a current examination may be closer in time to the assessment period than it would otherwise appear.

III.

Accordingly, we REVERSE and REMAND with instructions to the district court to remand this case to MetLife so that it may re-evaluate the merits of Mr. Demer’s LTD claim.

BYBEE, Circuit Judge,

dissenting from Part II.B, but concurring in the judgment:

An ERISA plan administrator has a structural conflict of interest where it “both funds the plan and evaluates the claims.” Metropolitan Life Ins. Co. v. Glenn, 554 U.S. 105, 112, 128 S.Ct. 2343, 171 L.Ed.2d 299 (2008); see also Abatie v. Alta Health & Life Ins. Co., 458 F.3d 955, 965 (9th Cir. 2006) (en banc). The federal courts have offered at least two ways that such conflicts “should prove less important (perhaps to the vanishing point).” Metro. Life, 554 U.S. at 116, 128 S.Ct. 2343. First, administrators may “wall[] off claims administrators from those interested in firm finances.” Id.; Abatie, 458 F.3d at 969 & n.7 (an administrator may show that “any conflict did not influence its decisionmaking process” by showing that “its employees do not have incentives to deny claims”); Davis v. Unum Life Ins. Co. of Am., 444 F.3d 569, 575-76 (7th Cir. 2006) (holding that absent evidence of “any specific incentive [for the in-house doctors] to derail [a] claim,” such as giving the doctors “some specific stake in the outcome of [a] case,” the theoretical argument that “in-house doctors have an inherent conflict in every case” is insufficient to change the standard of review). Second, plan administrators may refer medical evaluations to outside experts, such as doctors, who also have no interest in firm finances. Abatie, 458 F.3d at 969 & n.7 (“[T]he administrator might demonstrate that it used truly independent medical examiners or a neutral, independent review process”).

MetLife listened very carefully to what we said. It employed both of these methods: First, it walled off its claims administrators from its financial offices. And then, second, the claims office sought medical evaluations from outside, independent physicians who have no interest in MetLife’s finances. For its trouble, the majority is going to give MetLife additional scrutiny — the majority is “skeptical” of MetLife precisely because it did what we told it to. Maj. Op. at 903-04. The majority’s new skepticism has been rejected by every other circuit to have considered it. When companies structure their operations in response to our opinions and then we penalize them for doing exactly as we have suggested, we sow uncertainty into both law and business. I dissent from Section II.B of the majority’s opinion. Because that section is not otherwise necessary to the majority’s opinion, I concur in the judgment.

I

The problems with the majority’s new concept of skepticism are at least threefold. First, nothing in our decisions supports the majority’s new concept of skepticism. Our relevant cases stand for two propositions: (1) a claims administrator has an internal conflict of interest when it stands to profit from the claims it denies; and, (2) administrators can cleanse this conflict by taking steps such as creating ethical walls, making sure that claim denials are not rewarded, and hiring outside physician reviewers. The majority takes these rules about internal conflicts of interest for which “some skepticism” is warranted, and discovers an external conflict in outside reviewers that is deserving of “modest but ... some, but not substantial” skepticism — something between “no skepticism” and “enhanced skepticism.” Maj. Op. at 903-04. The majority has taken a mechanism we gave administrators for cleansing their conflicts — hiring independent reviewers — and turned it into a new source of conflicts. Second, there is no basis for inferring a conflict of interest in the outside reviewers simply because they reviewed multiple files for MetLife and were compensated for their work. Finally, I have great reservations about the use of “skepticism” as a standard of review.

A

In Abatie, we considered the problem with “an insurer that acts as both the plan administrator and the funding source for benefits,” and explained that the insurer thus “operates under what may be termed a structural conflict of interest.” Abatie, 458 F.3d at 965. We discussed this sort of structural conflict only, and then commented that it could be remedied by, for example, using “independent medical examiners.” Id. at 969 n.7 (emphasis added).

MetLife has an internal conflict of interest in this case. It both pays the benefits and evaluates the claims. No one disputes it. And no one disputes that MetLife has addressed this internal conflict by walling off its financial department from its claims department. See Maj. Op. 901 (assuming that “there is no residual structural conflict ... because of affirmative steps taken by MetLife to insulate its claims department”). But the majority then finds that an external conflict exists because MetLife refers files to outside physician reviewers.

The majority says that we are to view this relationship with skepticism, but I can find no basis in our decisions for this conclusion. Our cases simply hold that a structural conflict of interest may warrant skepticism, nothing more. See, e.g., Glenn, 554 U.S. at 112, 128 S.Ct. 2343 (holding that an ERISA conflict emerges where the same administrator “both funds the plan and evaluates the claims,” because “every dollar provided in benefits is a dollar spent by ... the [administrator]; and every dollar saved ... is a dollar in [the administrator’s] pocket”) (quotation omitted); Abatie, 458 F.3d at 965. The majority claims that in Montour v. Hartford Life & Accident Insurance Co., we “assumed that a relevant piece of extrinsic evidence would be ‘how frequently [the insurance company] contracts with the file reviewers it employed in this case.’ ” 588 F.3d 623, 634 (alteration in Montour). With respect, the majority overstates Montour. In Montour, we found that the Hartford Insurance Company, like MetLife, had a structural conflict of interest. But in Montour we also cited to extensive district court findings that caused us to conclude that “Hartford’s bias infiltrated the entire administrative decisionmaking process, which leads us to accord significant weight to the conflict.” Id. We then observed that Hartford had failed to show any efforts that it had made “to ‘assure accurate claims assessment[,]’ such as utilizing procedures to help ensure a neutral review process.” Id. (quoting Metro. Life, 554 U.S. at 117, 128 S.Ct. 2343). Notice what we said next: “To the contrary, in fact, Hartford’s nurse case manager took an advocacy position in her letters to Montour’s physicians soliciting their agreement with her disability conclusion.” Id. (emphasis added). Not only had Hartford failed to present any evidence to show its neutrality, the evidence suggested that Hartford was telling Montour’s doctor what it wanted to hear. We then faulted the claimant, Montour — and here is the passage the majority relies on — for “not submit[ting] any extrinsic evidence of bias, such as statistics regarding Hartford’s rate of claims denials or how frequently it contracts with the file reviewers it employed in this case.” Id. Far from demonstrating that we should be suspicious of outside reviewers, Montour stands for nothing more remarkable than the proposition that it would be relevant to have known in that case where Hartford “took an advocacy position” in its letters to Montour’s doctor what the relationship, if any, was between Hartford’s rate of claims denials and its reliance on outside reviewers. Here, without the predicate for our comment in Montour, the majority simply assumes that the evidence that MetLife used and paid outside reviewers is sufficient to show bias. The majority calls this a “financial conflict[ ],” Maj. Op. at 901, but we have no explanation why an outside reviewer has a conflict of interest by virtue of being compensated for her time.

I don’t see the conflict at all. The Supreme Court in Firestone found that “ERISA abounds with the language and terminology of trust law,” and that we should be “guided by principles of trust law.” Firestone Tire & Rubber Co. v. Bruch, 489 U.S. 101, 111, 112, 109 S.Ct. 948, 103 L.Ed.2d 80 (1989). The trust analogy is an important one, because a trustee is under a dual obligation: It must ensure that trust funds are paid to those who are eligible; at the same time, it must ensure that trust funds are not paid to those who are not eligible. A trustee-administrator as surely violates its obligation by violating the one charge as the other. MetLife knows that, as trustee, it has an internal conflict of interest and must compensate for that conflict in some way, and perhaps in several ways. Going to outside consulting physicians — literally, getting a second opinion — is one way of compensating for the conflict. But where is the conflict in the outside consulting physicians? Reliability, not bias, is the incentive for the outside reviewer. In the long run, it doesn’t do a company like MetLife any good to get bad medical advice because just as it has a duty to pay deserving claims, it has a duty to avoid paying undeserving claims.

I recognize that there could be a cognizable conflict of interest where an independent physician is so dependent on an administrator that it effectively becomes an employee of the administrator, see McDonald v. Hartford Life Grp. Ins. Co., 361 Fed.Appx. 599, 609 (5th Cir. 2010), but this would simply bring the reviewer within the administrator’s internal umbrella. And in that case, the outside reviewer would be subject to the same conflict of interest as the administrator’s own employees; the outside reviewer cannot be more conflicted than the administrator’s own claim processors (assuming the administrator has not given the reviewer some additional incentive tied to results). See Armstrong v. Aetna Life Ins. Co., 128 F.3d 1263, 1265 (8th Cir. 1997) (insurer provided incentives and bonuses to claims reviewers for “claims savings”). What is so odd about the majority’s analysis is that if the outside consultants were MetLife’s own internal employees, we would find that they were not conflicted — or at the least that any conflict had been neutralized by MetLife’s claims-handling practices. But for the majority, because the reviewers are independent, suddenly they are untrustworthy. Yet there is not one hint in the record that these outside reviewers are given financial incentives based on their results or in any other way biased.

The principle the majority adopts has profound implications for other areas of the law — notably Social Security claims. The SSA, and its state partners, frequently rely on outside medical sources to review a claimant’s file and offer a second opinion. The views of these reviewing physicians are given significant weight under SSA regulations and our decisions. See 20 C.F.R. § 404.1519a-q (SSA rules governing the hiring and use of physician reviewers); Reed v. Massanari, 270 F.3d 838, 842 n.1 (9th Cir. 2001) (noting the “important role played by independent medical specialists” in SSA cases, and overturning the ALJ’s decision not to order an outside physician review); Kish v. Colvin, 552 Fed.Appx. 650, 651 (9th Cir. 2014) (overturning the ALJ’s determination because he failed to order an outside physician review); see also Standards For Consultative Examinations and Existing Medical Evidence, 56 Fed. Reg. 36932, 36949 (Aug. 1, 1991) (“[The SSA] spend[s] considerable sums annually to obtain consultative examinations.”). Sometimes the SSA uses outside physician reviewers, and sometimes it uses physicians employed full time by the SSA’s state-partner agencies. See Wilson v. Comm’r of Soc. Sec., 280 Fed.Appx. 456, 462 (6th Cir. 2008) (relying in part on evaluation conducted by psychologist employed by state agency); Patty v. Barnhart, 189 Fed.Appx. 517, 519 (7th Cir. 2006) (relying on both state-employed and outside physician reviewers); see also 20 C.F.R. § 404.1519g (stating the SSA may purchase an outside examination from any qualified provider, including the claimant’s own physicians or “another source”). So far as I know, we have never questioned the bona fides of reviewing physicians’ views on the grounds that SSA or state agencies are sending them lots of business and paying them well, or, even worse, employing them full time. It will turn our cases upside down if we start down that road.

B

Even if we considered Demer’s evidence on the outside reviewers in this case, there is “no there there.” The majority’s new skepticism is based on two facts: First, the majority thinks that MetLife’s outside reviewers are doing a lot of work for MetLife and, second, the majority thinks the outside reviewers are getting paid a lot of money for their work. Maj. Op. at 902-03 (referring to the “magnitudes of these numbers”). Neither of these reasons will bear scrutiny.

Let’s start with the idea that the two doctors in question here — both of whom were board-certified in internal medicine — are doing a lot of work for MetLife. The record discloses that Dr. Del Valle reviewed some 250 files per year; Dr. Gordan, some 200-300 files. We have no basis for judging the significance of these numbers. The majority has just decided that these numbers are big. Let’s indulge the assumption: so what? Even if MetLife decided to stop paying these two doctors as outside consultants and brought them in-house, it wouldn’t implicate a conflict. MetLife has followed our directions and walled off its claim processors — including any medical personnel — from its financial people. Its claim processors have not been given any incentives to deny claims. That is precisely what we said would cleanse MetLife’s conflicts. Abatie, 458 F.3d at 969 n.7. So whether the doctors are doing a little bit of work for MetLife or a lot of work is irrelevant because there is no smidgen of evidence in the record that MetLife will compensate the doctors based on the results of their evaluations. The majority has no basis whatsoever for impugning the reputations of these medical professionals.

The majority’s second theory — that the doctors are well compensated — doesn’t hold any more water than the first. Dr. Gordan was compensated $175,000/year for his work; Dr. Del Valle, $125,000. Based on nothing more than its own ipse dixit, the majority declares this to be “significant compensation.” Maj. Op. at 900. We have nothing in the record to tell us how doctors are compensated, whether as practicing physicians or as outside consultants. I could opine that this doesn’t seem like a lot of money for an experienced medical professional, but I have no more basis than the majority for saying so. And perhaps there is some professional jealousy at work here, but if these doctors were lawyers, they aren’t even making first-year associate wages.

Would we feel better if MetLife found physicians who were willing to be poorly compensated? Does it get better advice? Does it avoid the conflict? The doctors’ compensation tells us nothing about how the doctors make their medical judgments. And if we are going to indict the medical profession for understanding the potential interests of its clients, do we get to be skeptical of Demer’s personal physician because Demer’s insurance is paying for his opinions? What if the doctor knows that Demer will go get a second opinion (and perhaps leave his personal physician and give him a poor review on Yelp) unless the doctor finds that Demer is disabled? Does Demer’s family doctor have a financial incentive to reach a particular medical judgment? What if Demer’s doctor refers him to a second physician for an opinion? Does the second doctor have a conflict of interest because he fears that if he doesn’t corroborate the referring doctor’s opinion he won’t get referrals in the future? There is no end to the gamesmanship that can be played here.

C

That brings me to the question of “skepticism.” The majority, of course, should not be faulted for relying on our prior decisions and this “skepticism” scheme. We first used the term in Abatie, 458 F.3d at 968 (referring to the “level of skepticism” with which we view a “structural conflict of interest”). In Montour, we picked up on the “skepticism” comment and amplified it, finding that courts should “adjust the level of skepticism” “in accordance with the facts and circumstances of the case” and that where “the conflict may have tainted the entire administrative decisionmaking process,” the courts should view the administrator’s decision with “enhanced skepticism.” 588 F.3d at 631. We have continued to solidify and elevate Montour’s scheme. See Stephan v. Unum Life Ins. Co. of America, 697 F.3d 917, 934 (9th Cir. 2012); Salomaa v. Honda Long Term Disability Plan, 642 F.3d 666, 675 (9th Cir. 2011); Nolan, 551 F.3d at 1155. Even as we acknowledge that it is not “easy to decide how many metaphorical grams should go on the metaphorical scale,” we have added additional qualifiers to “skepticism” to give us a range of skepticism, including “special skepticism,” “additional skepticism,” and a “higher degree of skepticism.” Salomaa, 642 F.3d at 675-76.

I question the usefulness of “skepticism” as a standard of review. Standards of review are, necessarily, not “rules” and are subject to the vagaries of language. In Justice Frankfurter’s memorable phrase, standards of review capture a “mood,” but a “mood [that] must be respected, even though it can only serve as a standard for judgment and not as a body of rigid rules assuring sameness of applications.” Universal Camera Corp. v. NLRB, 340 U.S. 474, 487, 71 S.Ct. 456, 95 L.Ed. 456 (1951). We are burdened with a variety of “word formulas,” FTC v. Sun Oil Co., 371 U.S. 505, 527, 83 S.Ct. 358, 9 L.Ed.2d 466 (1963), to describe amorphous concepts such as evidence, error, and discretion. And we (and Congress) have developed lists of adjectives to qualify those concepts. Thus, we have at least six different kinds of “evidence”: a “scintilla of evidence,” “some evidence,” “substantial evidence,” a “preponderance of evidence,” “clear and convincing evidence,” and “evidence beyond a reasonable doubt.” It isn’t enough to commit an “error,” we must know whether it was “harmless error,” “clear error,” “plain error,” or “invited error.” When a court or agency exercises discretion, we need to know whether it was “an abuse of discretion” or merely “arbitrary and capricious.” And I won’t even go down the road of constitutional “scrutiny.”

The problem with “skepticism” is not that it is not descriptive in some useful way, but that we don’t need another noun modified by a raft of adjectives to capture our mood when we review decisions under ERISA. We know that the standard of review for ERISA plans in which some discretion is conferred on the plan trustee is the well-established “abuse of discretion.” Firestone, 489 U.S. at 115, 109 S.Ct. 948; Abatie, 458 F.3d at 962-63. That standard is hard enough to apply without adding a layer of “skepticism” on top of it. We are trained in the law, and we are, either by nature or by avocation, skeptical. It’s what we do. We are skeptical of what defendants, witnesses, and even lawyers tell us. We question their memories, reliability, and motives. And we have developed our own ranges of skepticism, from ordinary eyebrow-raising skepticism to the more ambiguous harrumphing skepticism to full-blown, exasperated snorting skepticism. In terms of expressing a mood, we are much better off with the facial expressions and the occasional snort than with trying to define the difference between “some skepticism,” “special skepticism,” “enhanced skepticism,” and “high-degree skepticism.” I would abandon “skepticism” as a separate standard of review in ERISA cases and try to deal with run-of-the-mill “abuse of discretion.”

II

These are problems of our own making. Every circuit that has considered the use that MetLife made of outside reviewers here has approved the practice, and they have managed to do so without invoking the standard of “skepticism.” The courts not only don’t apply even a modicum of skepticism, they often reject the claim that there is any conflict in the first place. For example, in Davis v. Unum Life Insurance Company of America, the Seventh Circuit noted the absurdity of turning “an administrator’s decision to seek independent expert advice [which] is evidence of a thorough investigation” into grounds for criticizing the administrator. 444 F.3d 569, 575-76 (7th Cir. 2006) (quotation omitted). The court explained that “[p]aying for a legitimate and valuable service in order to evaluate a claim thoroughly does not create a review-altering conflict.” Id. at 575. This was true even where the reviewers’ entire salary came from a single administrator. The Fifth, Sixth, and Tenth Circuits have adopted similar rules. See Hagen v. Aetna Ins. Co., 808 F.3d 1022, 1029 (5th Cir. 2015) (holding that the relationship between the administrator and reviewer was not enough, on its own, to show bias; there must be some evidence of specific stake in the claim); Hunt v. Metro. Life Ins. Co., 587 Fed.Appx. 860, 862 (6th Cir. 2014) (requiring more than mere evidence of payments between the administrator and reviewer); Benson v. Hartford Life & Acc. Ins. Co., 511 Fed.Appx. 680, 685 (10th Cir. 2013) (rejecting claim of reviewer conflict because “[n]othing in the record suggest[ed] a specific bias on the part of the ... reviewing physicians.”); Christoff v. Ohio N. Univ. Employee Ben. Plan, 495 Fed.Appx. 710, 712 (6th Cir. 2012) (declining to find conflict of interest because the administrator used “independent reviewers” to deny claims).

Where the plan administrator is operating under a structural conflict of interest, courts have often concluded that the conflict is mitigated by the administrator’s decision to “reduce its inherent bias by hiring ... independent physicians.” Holcomb v. Unum Life Ins. Co. of Am., 578 F.3d 1187, 1193 (10th Cir. 2009); see also Fite v. Bayer Corp., 554 Fed.Appx. 712, 717 (10th Cir. 2014) (“Bayer took active steps to reduce any potential bias and to promote accuracy: it sought an independent review of Ms. Fite’s medical records by a different psychiatrist ... and it obtained an independent psychiatric evaluation of Ms. Fite from a fourth psychiatrist before reaching its final decision. We therefore give the conflict-of-interest factor limited weight in determining whether Bayer abused its discretion.”); Menge v. AT & T, Inc., 595 Fed.Appx. 811, 814 (10th Cir. 2014) (holding that an administrator had mitigated its “inherent conflict of interest” by “rel[ying] on the medical opinions of independent physician advisors”); Keith v. Prudential Ins. Co. of Am., 347 Fed.Appx. 548, 552 (11th Cir. 2009) (holding that an administrator’s use of three independent physician reviewers showed the administrator was not “influenced by [its] [structural] conflict”). Indeed, it is more likely that courts will hold it against an administrator who doesn’t use independent reviewers. As the First Circuit explained, it is “difficult to fault a plan administrator for seeking expert assistance (indeed, it probably would be easier to fault a plan administrator for not seeking such assistance).... [C]ommon sense dictates that retaining outside physicians to assist in evaluating disability claims, without more, does not constitute a conflict of interest.” Leahy v. Raytheon Co., 315 F.3d 11, 16 (1st Cir. 2002); see also Loan v. Prudential Ins. Co. of Am., 370 Fed.Appx. 592, 598 (6th Cir. 2010) (“Prudential did not seek the opinion of an outside expert who was not operating under Prudential’s conflict-of-interest.... Prudential’s in-house doctor had less incentive than an independent outside doctor to conduct a thorough and accurate review of the record and address the arguments Plaintiffs raised on appeal.” (emphasis added)).

III

The majority takes the remedies we offered to administrators for cleansing conflicts — walling off its claim processors and hiring independent reviewers — and turns it into a sword to punish administrators with skepticism. After today’s decision, we cannot fault administrators for their confusion over what they can rely on in our decisions. And we can predict with near certainty how they will respond: at least in our circuit, administrators will stop using outside, independent reviewers; instead, they will try to bring them in house where, they hope, we will still respect the administrator’s efforts to wall them off. Today’s decision injects confusion and change for no reason. We are not likely to end up with better decisions in ERISA claims.

Count me skeptical. I respectfully dissent from Section II.B. 
      
      . As indicated above, Dr. Weidman submitted a more recent statement from February 2010, indicating that Mr. Demer's condition had worsened after April 2009.
     
      
      . After MetLife denied his appeal, Mr. Demer sent a letter to MetLife, claiming that there was additional information from Dr. Weidman and Dr. Moore that had been sent to MetLife prior to the decision on appeal but that had not been addressed in the decision on appeal. Mr. Demer attached that information to his letter. That information included, inter alia, a treatment note from Dr. Weidman indicating that Mr. Demer was on a higher amount of opiate analgesics which seemed to cause a slight change in his speech and a treatment note from Dr. Moore stating that Mr. Demer was taking opiates which affected his cognition and executive functioning, including memory. The documentation was reviewed by a MetLife appeals nurse consultant. “The nurse consultant opined that while some current clinical exam changes were noted, no additional clinical findings were submitted relating to the appeal period in question....” MetLife also noted that the Plan had only “one level of appeal” and that appeal had already been denied on May 6, 2011.
      At the trial level, the district court refused to consider Mr. Demer's post-appeal evidence. Mr. Demer now argues that this refusal was erroneous. For purposes of this appeal, we need not decide the issue of whether the post-appeal evidence should have been considered. Even without the post-appeal evidence, Mr. Demer is entitled to a remand, as discussed below.
     
      
      . The dissent is critical of our review for abuse of discretion with skepticism because, inter alia, the term skepticism is “not descriptive in some useful way,” and even less so when "modified by a raft of adjectives.” However, the framework employing abuse of discretion review subject to some degree of skepticism (where warranted) is well established under both Glenn and Ninth Circuit law. See Glenn, 554 U.S. at 117, 128 S.Ct. 2343 (noting that requiring consideration of a conflict of interest as a factor "is no stranger to the judicial system” as "[n]ot only trust law, but also administrative law, can ask judges to determine lawfulness by taking account of several different, often case-specific, factors, reaching a result by weighing all together”); Abatie, 458 F.3d at 969 (noting that "abuse of discretion review, with any 'conflict ... weighed as a factor,' is indefinite” but "trial courts are familiar with the process of weighing a conflict of interest”).
     
      
       . The district court did not err in considering the Social Security Administration’s ("SSA") denial of Mr. Demer’s claim for disability benefits as additional evidence that MetLife did not have a conflict of interest. See Demer, 975 F.Supp.2d at 1077 (stating that, “[a]lthough not a decision by an administrative law judge, the SSA's findings support the objectivity of MetLife's review of the medical evidence”). Contrary to what Mr. Demer suggests, the district court did not rely on the SSA decision to support MetLife’s ruling on the merits. See Harlick, 686 F.3d at 719-20 (stating that "[t]he general rule ... is that a court will not allow an ERISA plan administrator to assert a reason for denial of benefits that it had not given during the administrative process").
     
      
       . In so ruling, we acknowledge the Supreme Court’s statement in Glenn that it did not "believe it necessary or desirable for courts to create special burden-of-proof rules, or other special procedural or evidentiary rules, focused narrowly upon the evaluator/payor conflict.” Glenn, 554 U.S. at 116, 128 S.Ct. 2343. However, we do not read this language as barring the burden approach articulated above. Glenn's statement was directed at the issue, once a conflict is identified, “ ‘how’ the conflict ... should ‘be taken into account on judicial review of a discretionary benefit determination.’ ” Id. at 115, 128 S.Ct. 2343 (citation omitted). The Supreme Court did not consider and hence did not foreclose an articulation of the burdens in determining whether there is a cognizable conflict in the first place.
     
      
      
         . See also Davis v. Unum Life Ins. Co. of Am., 444 F.3d 569, 575-76 (7th Cir. 2006) (rejecting contention that “in-house doctors have an inherent conflict in every case”; noting lack of evidence of "any specific incentive [for the in-house doctors] to derail [a] claim” — e.g., giving the doctors "some specific stake in the outcome of [a] case, such as paying the doctors more if [the] claim were denied”). While such a stake would be strong evidence of conflict, as a practical matter, this seems a highly unlikely scenario. It is hard to imagine that a plan administrator would explicitly tie compensation to results, as clearly such a practice would be viewed disapprovingly by courts. To the extent the dissent suggests that no financial conflict of interest may be found absent evidence that an IPC has a specific stake in the outcome of the case, such a rule, if adopted, would render review for financial conflict of interest toothless. Nothing we said in Montour suggests such a categorical rule.
     
      
       . See, e.g., Caplan v. CNA Fin. Corp., 544 F.Supp.2d 984, 992 (N.D. Cal. 2008) (noting that a doctor "stood to benefit financially from the repeat business that might come from providing [defendant] with reports that were to its liking”; adding that "[t]he history of [the doctor’s] conclusions provides evidence of this conflict”).
     
      
      . For example, MetLife could have provided but did not proffer evidence that the IPCs being challenged do not in fact have a parsimonious pattern of assessment unfavorable to claimants. Nor did MetLife submit evidence, e.g., that the fees paid to the IPCs constitute only a small fraction of their income or that the high number of reviews conducted by an IPC in a particular year was an aberration.
     
      
      . Contrary to what the dissent seems to suggest, bias of an IPC may be inferred even where the IPC is not entirely "financially dependent” on income received from an administrator. Obtaining even, e.g., 30% of one's income from one administrator could be sufficiently influential as to give rise to a reasonable inference of bias. See, e.g., Nolan, 551 F.3d at 1152 n.3 (30% of reviewer’s income came from administrator). Moreover, were claimants required to show financial dependence in order to establish lack of neutrality, the personal financial circumstances and needs of each IPC could be subjected to routine inquiry in ERISA cases, a result hardly conducive to the recruitment of competent reviewers or to the efficient and expeditious review of benefit decisions contemplated by ERISA.
     
      
      . Similarly, a medical record indicated that Mr. Demer's father told him “he was ‘druggy’ ” after being prescribed certain medication (medical record from treating physician, Dr. Debra Weidman (anesthesiologist)).
     
      
      . Because Ms. Piel’s statement was submitted only on the appeal to MetLife, Dr. Del Valle was not able to consider it. However, Dr. Gordan should have taken into account Ms. Piel’s statement as he was the IPC on appeal.
     
      
       . We acknowledge that Dr. Gordan did consult with Mr. Demer's treating physician, Dr. Moore, and Dr. Moore indicated to Dr. Gordan that "it was likely that Mr. Demer could do a very sedentary job.” Dr. Moore, however, added that "he felt that he would have to see [Mr. Demer] again to say that definitively.” Yet Dr. Gordan never physically or mentally examined Mr. Demer for confirmation one way or the other; nor did Dr. Moore. Hence, any reliance by MetLife on Dr. Moore's statement is misplaced.
     
      
      . Montour cited to Nolan v. Heald College, 551 F.3d 1148 (9th Cir. 2009), where, in a footnote, we suggested that there might be some relationship between an outside reviewer’s regular contract with the insurance company and the reviewer’s bias. Id. at 1152 n.3. We offered no explanation why this was a problem. We later commented that “This is not to say that [the insurance company] was not entitled to rely on the opinions of its independent physicians,” but we assumed that it was evidence of bias that the district court might have viewed “with skepticism.” Id. at 1155.
     
      
       . The majority contends that I have "mistakenly equate[d] outside experts with independent experts.” Maj. Op. at 903. Not so. I have simply equated better-paid outside experts with slightly-lesser-paid outside experts. The question in this case is not whether there is any circumstance in which an outside expert could lose her independence — of course there is. The question here is whether paying a board-certified physician an amount that strikes the majority as a lot, on its own, makes that physician no longer "independent.” And to me, the answer must be no.
     
      
       . It is far from clear from the cases (including the record in this case) what files we think the outside physicians are reviewing. Are the reviewers looking at all files, including files the claims department is inclined to pay? If so, why would the reviewing physicians offer anything other than their best view?
      But what if the claims department is only sending outside reviewers the files of claims it is inclined to deny? In that case the outside reviewers still have little incentive to shade their views in favor of the claims department. This is not advocacy, it is medical opinion, and the claims department, after all, is asking the reviewers for a second opinion before denying the claim.
      If there were evidence an administrator was signaling to its outside reviewers what result it wanted them to reach, that would be a serious problem. In such a case, the signaling would be direct evidence that the administrator had not addressed its structural conflict of interest.
     
      
       . The majority's approach to the burdens here makes little sense. If the claimant bears the burden, as Montour suggests, 588 F.3d at 634, then gaps in the record — as we have here — should weigh against Demer. The majority fails to explain how pointing to two numbers (the number of files received and the sum of payments received) establishes that the outside reviewers were biased.
     
      
       . See David Lat, Breaking: NY to $180K!!! Cravath Raises Associate Base Salaries!!!, June 6, 2016, http://abovethelaw.com/2016/06/breaking-ny-to-180k-cravath-raises-associate-base-salaries/.
     
      
      . The majority suggests that its approach to analyzing outside experts is unremarkable, and points out that other circuits do not hold "that outside experts are immune from judicial scrutiny for possible bias.” Maj. Op. at 904. We can (and should) scrutinize outside experts when there are facts that demonstrate that their judgment is not independent. But that is not where I and our sister circuits depart from the majority. Our disagreement is with the majority's contention that the bare fact of compensating an outside expert for her opinion makes her biased.
     