
    Diane DENMARK, Plaintiff, Appellant, v. LIBERTY LIFE ASSURANCE COMPANY OF BOSTON, and The Genrad, Inc. Long Term Disability Plan, Through Teradyne, Inc., as Successor Fiduciary, Defendants, Appellees.
    No. 05-2877.
    United States Court of Appeals, First Circuit.
    Heard Sept. 13, 2006.
    Decided March 28, 2007.
    
      Jonathan M. Feigenbaum, with whom Phillips & Angley was on brief, for appellant.
    Andrew C. Pickett, with whom Ashley B. Abel, Richard W. Paterniti, and Jackson Lewis LLP were on brief, for appellees.
    Before SELYA, LIPEZ, and HOWARD, Circuit Judges.
   LIPEZ, Circuit Judge.

This seemingly straightforward appeal of a denial of disability benefits presents difficult issues involving our standard of review in cases arising under the Employee Retirement Income Security Act (“ERISA”), 29 U.S.C. §§ 1001-1461. In fact, the issues are so difficult that this case has generated three opinions. Although I agree with the district court that the insurer’s denial of disability benefits was neither arbitrary nor capricious, I believe that our circuit should reexamine in an en banc proceeding the standard of review that applies when an insurer both reviews and pays disability claims, resulting in a structural conflict of interest. Judge Selya concurs in the judgment affirming the decision of the district court, but disagrees with my assessment of the standard of review. Finally, although Judge Howard dissents, believing that the disability benefits denial was arbitrary and capricious under our current standard of review, he joins me in concluding that our circuit should reexamine our standard of review in these structural conflict cases.

I.

Appellant Diane Denmark (“Denmark”) is a former participant in a group long term disability insurance plan offered through her employer, GenRad, Inc. (“GenRad”), and its successor in interest, Teradyne, Inc. (“Teradyne”). Appellee Liberty Life Assurance Company (“Liberty”), the plan insurer, denied Denmark’s claim for long term disability benefits on August 20, 2002. After exhausting her administrative appeals, Denmark filed suit. On cross-motions for summary judgment, the district court found that the decision to deny benefits was neither arbitrary nor capricious and entered summary judgment for defendants.

On appeal, Denmark argues that this court should subject Liberty’s benefits decision to de novo review on several grounds: an improper delegation of discretionary authority, a structural conflict of interest resulting from Liberty’s dual responsibility for making benefits determinations and paying claims, and the fact that the entire benefits decision was “infected with conflict.” To further complicate matters, the district court employed a heightened standard of review with respect to one medical opinion as a sanction for Liberty’s refusal to comply with a discovery order, and we must consider how to deal with that aspect of the court’s decision. Finally, Denmark contends that Liberty’s benefits decision cannot survive even deferential arbitrary and capricious review.

A. Factual Background

The facts are undisputed (although the conclusions and inferences to be drawn from them are not). This opinion will recite them here in some detail to facilitate the analysis of the issues raised by the parties. Denmark began working at GenRad on April 2, 1973 as a Group Leader in Manufacturing Inspection. Her job involved inspecting GenRad products and overseeing other inspection personnel.

She was diagnosed with fibromyalgia in 1996 by her primary care physician, Dr. Gregory Malanoski. Fibromyalgia is a chronic disorder characterized by muscle pain and fatigue that can be alleviated, but not cured. Certain drug regimes and physical therapy often help to improve a patient’s condition.

Despite her illness, Denmark continued working for several years, including during several periods when her symptoms worsened. On October 3, 2001, she stopped working for health reasons. She has not returned to work.

At the time she left work, Denmark was covered under GenRad’s Short Term Disability Benefits Plan (“STD plan”) and its Long Term Disability Benefits Plan (“LTD plan”). Liberty served as the claims administrator for the STD plan, providing an initial claims review and a decision for STD claims submitted by GenRad employees. GenRad then reviewed appeals of STD benefit denials and paid meritorious claims. For the LTD plan, Liberty both made benefits determinations and paid for claims out of its own assets. At some point in late 2001, Teradyne acquired GenRad, but Denmark’s disability benefits under the two plans remained the same.

1. Short Term Disability Claim

GenRad’s STD plan defines “disabled” to mean that the claimant is “unable to perform all of the material and substantial duties of [her] occupation on an Active Employment basis because of an Injury or Sickness.”

a. Initial Review by Nurse Kaye

Denmark filed for STD benefits shortly after she stopped working on October 3, 2001. Her claim was reviewed by Nurse Debra Kaye, a Liberty Disability Case Manager. Kaye reviewed medical records provided by Dr. Malanoski, Denmark’s primary care physician; Dr. Thomas Goodman, a rheumatologist to whom Dr. Malanoski referred Denmark; and Dr. Terrence Hack, a cardiologist. She also reviewed a description of Denmark’s job from GenRad.

Dr. Malanoski, who originally diagnosed Denmark with fibromyalgia in 1996, examined her on October 4, 2001, the day after she stopped working. His notes from that date state that Denmark was “[d]oing poorly: much worse myalgia generally,” “[h]eadache, hard to get out of [illegible], general point muscle tenderness.” He also listed the nine drugs that Denmark was taking and concluded: “No work until further [follow-up].” In an Attending Physician Statement reporting on that visit, dated November 6, Dr. Malanoski considered Denmark’s physical impairment “Class 5— Severe limitation of functional capacity, incapable of minimum activity.” In response to a question asking for the “objective medical findings that support the above restrictions and limitations,” he wrote: “diffuse muscle tenderness,” “weakness,” and “fatigue.”

Dr. Malanoski referred Denmark to Dr. Goodman, a rheumatologist who saw her on October 8. Dr. Goodman noted that she experienced “palpitations, sharp (‘stinging’) pain” and “fatigue/exhaustion/myalgia/insomnia,” and was “stiff, tired, exhausted in AM. Needs afternoon nap.”

Dr. Terrence Hack, a cardiologist, prescribed many medications to help manage Denmark’s angina and high blood pressure. He also provided a report indicating that there were no serious cardiac arrhythmias or other cardiac symptoms that rendered Denmark unable to work.

GenRad’s description of Denmark’s position listed the physical demands: “Bending, squatting and body movement involved inspecting external and internal components of various products. Ability to utilize material handling equipment to move test equipment and position product. Occasional lifting of 25 pounds.”

After reviewing this information, Nurse Kaye noted in Liberty’s claim record on November 14, 2001 that “there is no indication as to what has changed to warrant [restrictions/limitations] provided by [her primary care provider] to justify [stopping work].” She acknowledged that “Clmnt has [history] of fibromyalgia x5 years as diagnosed by her [primary care provider], with episodes of flare in condition,” but added that “there is no evidence that clmnt needed to cease occupational functioning & in fact, was able to function in an occupational setting full time, working long hours.”

b. Peer Review by Dr. Miller

Nurse Kaye requested a peer review from Dr. Clay Miller, an independent physician specializing in physical medicine and rehabilitation. She asked: “Does the accompanying documentation provide objective findings that would indicate a significant change in condition that was evident on or about the date of disability?”

On December 5, Dr. Miller responded that the documentation was insufficient because it did not “provide objective findings that would indicate a significant change in condition that was evident on or about the date of disability.” He elaborated: “The documentation provided does not indicate a significant change in the patient’s condition about the time of disability 10/03/2001 because there are no objective physical functional deficits documented and the patient had a normal cardiac exercise test 11/09/2001.” He also limited the significance of the reports of Dr. Malanoski and Dr. Goodman for the purposes of a disability benefits determination, stating:

[t]he rheumatology exam was positive only for the 18-fibromyalgia tender points. There are no documented objective physical exam findings that support a decrease or significant change in this patient’s physical condition. In fact, the patient had a normal cardiac exercise test 11/09/2001. Therefore, the medical records provided do not substantiate that the patient’s condition significantly changed about the time of disability 10/03/2001.

c. Denial of Short Term Disability Benefits

The Liberty claim record includes entries on December 6, 2001 that summarize Dr. Miller’s peer review: “Dr[.] indicates review of the medical information does not provide objective findings that would indicate a significant change in clmt’s condition on or about [date of disability].” The claim record repeatedly notes the lack of objective findings.

In addition to his notes described above, Dr. Goodman, the rheumatologist, submitted a follow-up letter dated December 11, 2001. He stated that Denmark’s symptoms of fatigue, exhaustion, myalgia and insomnia had become “quite marked over the last year or so, such that she is unable to perform her usual work as a quality control group leader.” Noting that “[t]his work requires her to be on her feet all day,” he concluded that “she remains totally disabled in terms of her line of work.”

In a letter dated December 26, 2001, Liberty denied Denmark’s claim for STD benefits. The letter cites the records provided by Drs. Malanoski, Goodman, and Hack, as well as the peer review of Dr. Miller. The letter explained that the lack of documented objective physical exam findings supporting a significant change in Denmark’s physical condition made it impossible to define “restrictions and limitations” resulting from her condition that would preclude her from performing her job responsibilities at GenRad.

d. Appeal of Short Term Disability Benefits Decision

Denmark appealed her denial of STD benefits in January 2002. Liberty informed Denmark that her employer, now Teradyne, reviewed all appeals itself. Liberty also sent letters to Drs. Malanoski and Goodman inviting them to express disagreement with Dr. Miller’s peer review. Dr. Malanoski responded on January 14, 2002, stating:

Diane Denmark is a patient of mine who suffers from substantial symptoms of fibromyalgia. I strongly disagree with your peer review decision not to provide disability benefits. As you know, fibromyalgia is a condition lacking abnormalities in blood testing or specific abnormalities in physical exam. [Denmark] describes ... extreme pain, fatigue, and sleep disorder (among others). Dr. Goodman, our Rheumatologist specialist, agrees with her degree of disability.

Teradyne decided to have Denmark examined by an independent medical examiner (“IME”). Dr. Peter Schur evaluated Denmark and, in a letter to Dr. Goodman on April 12, concluded that “at least for the time being, she is clearly disabled not only from work, but from being able to take care of her household.” He explained that Denmark “can no longer do housework, cook, needs help shopping, although she is still able to do her laundry”; that she is constantly fatigued and “has lots of aches and pains all over her body”; that the range of motion in her limbs is limited; and that she “clearly has a sleep disorder.” As to the diagnosis of fibromyalgia, Dr. Schur noted that Denmark’s “shoulder and pelvic girdle problems plus her history of an elevated sedimentation rate would make one suspect PMR [polymyalgia rheumatica], but I think most of this, in fact, as suggested by others, is classical fibromyalgia.” He concluded that modifying Denmark’s regime of exercise and medication may help her “get her stamina back and be able to go back to work.”

Based on this independent evaluation by Dr. Schur, Teradyne determined that it would pay for six months of STD benefits for Denmark, retroactive to her date of disability.

2. Long Term Disability Benefits

Under the LTD benefits plan, one qualifies as disabled if, for the first two years, “the Covered Person, as a result of Injury o[r] Sickness, is unable to perform the Material and Substantial Duties of his Own Occupation” and “thereafter, the Covered Person is unable to perform, with reasonable continuity, the Material and Substantial Duties of Any Occupation.”

a. Review by Nurse Kaye and Denial of Benefits

Denmark filed for LTD benefits in June 2002. Nurse Kaye again reviewed Denmark’s medical file, focusing on the information added since the conclusion of Liberty’s initial review of Denmark’s claim on December 10, 2001. Since that time, Denmark had submitted Dr. Goodman’s letter of December 11, 2001, Dr. Malanoski’s letter of January 14, 2002 and Dr. Schur’s IME report as part of the STD appeals process conducted by Teradyne. However, Denmark did not add any additional medical reports regarding her physical limitations for her LTD benefits claim.

In addition, Kaye reviewed an activities questionnaire that Denmark completed on July 23, 2002, in which Denmark stated that she could sit for only thirty minutes at a time, stand for ten minutes, walk for ten to fifteen minutes, sit in a car for twenty minutes, and drive for five minutes. She also stated that she spends fourteen hours a day in bed and needs help grocery shopping, carrying groceries, and washing and drying her hair. Finally, she noted that she “can not concentrate for very long. I go to do something then I find myself standing in the room wondering what I was going to do.”

Kaye also reviewed the Department of Labor’s general description of Denmark’s position, last updated in 1988. This description listed the physical requirements of Denmark’s job as “Light,” stating that she would need to lift, carry or pull 20 pounds “occasionally,” up to ten pounds “frequently,” and smaller weights “constantly.” The job could include frequent walking and standing.

After her review, Kaye again concluded that there had been no “significant change” in Denmark’s condition around her claimed date of disability. Denmark had had fibromyalgia for six years and “managed to maintain occupational functioning during that time, even during periods of exacerbation. This diagnosis has been confirmed by multiple providers.” Kaye limited the significance of the IME report by Dr. Schur, noting that “[s]ince an IME provides an examination on a specific date in time, its scope is limited in that inferences to the status of conditions 6 months previous cannot be accurately assessed.” Finally, Kaye questioned the extent of Denmark’s disability, stating that she “may be self-limiting her work or social activities, with no objective medical basis to support [restrictions and limitations]” and that her “perception that she is unable to work & is seeking permanent benefits” was a “barrier[] to recovery.”

On August 20, 2002, Liberty denied Denmark’s LTD benefits claim, citing the reports of Drs. Malanoski, Goodman, Hack, Miller, and Schur. The denial letter states: “Our assessment of the new information provided by Dr. S[c]hur does not find any information to alter our previous findings that there was no significant change in Ms. Denmark’s condition on October 3, 2001 which would preclude her from performing the Material and Substantial duties of her occupation.”

b. Appeal of Long Term Disability Benefits Decision

Denmark requested review of the denial of her LTD claim in September 2002. She provided all of her medical records, including a report from a psychological examination by Dr. Taylor that was completed on July 24, 2002, but apparently was not included in Nurse Kaye’s review. Dr. Taylor described Denmark’s illness as “well documented,” and concluded: “She presents in an honest and straightforward manner and I suspect that she has some significant physical limitations, as noted by Dr. Schur, M.D. I do not believe that her depression, in and of itself, is of sufficient severity to prevent her from gainful employment.” Again, however, Denmark did not add any additional medical reports regarding her physical limitations.

Liberty arranged for surveillance of Denmark to monitor her activity level on October 24 and November 2, 4, and 5, 2002. On October 24, the investigator observed Denmark running errands for about two and a half hours. The investigator photographed her lifting groceries such as a gallon of water with one hand and placing them in her car. The investigator also reported that Denmark was “walking and moving in a fluid non-obstructed manner, bending and lifting items such as a case of soda and gallon of milk without difficulties.” On another day Denmark ran errands for about an hour and a half, and on the last two days the investigator did not observe any activity. Liberty’s claim log notes that the activity check “found her to be very active running errands,” but that “the investigator had no video camera on him ... so we only have stills of her and not actual moving footage. Will go [a]head and have peer review perform[ed.] I was hoping to supply them with surveillance footage but as there wasn’t much will not send.”

Liberty also obtained a Labor Market Survey regarding Denmark’s job. The survey examined various sources and conducted labor market research, finding that “[p]hysical demands were described as sedentary to light with occasional lifting up to 20 pounds” and that “[t]he opportunity to change positions from sitting, standing and walking was provided throughout the workday.”

Liberty submitted all the information it had accumulated to the Network Medical Review — Elite Physicians (“NMR”) for a final medical review. Dr. John Bomalaski did the review and wrote a report on December 4, 2002. He stated that the primary diagnosis “may be” fibromyalgia, but that “[t]he physical examination and testing do not support the diagnosis of Ms. Denmark’s treating physicians, at least within the records provided. There is no documentation of laboratory testing ruling out coexisting causes of myalgia.” Dr. Bomalaski stated that “the diagnosis of fibromyalgia remains in question not only by this reviewer but also by Dr. Schur, the consultant rheumatologist who had examined Ms. Denmark at the request of her treating rheumatologist, Dr. Goodman.”

Dr. Bomalaski also evaluated Denmark’s functional limitations. Relying on the surveillance evidence, he noted that she can engage in activities such as sitting, standing, walking, driving, reading, and lifting up to twenty pounds “occasionally,” meaning “up to 1/3 of the time,” and observed that she is “able to lift a gallon of water (10 pounds) with one hand and place in car from grocery cart.” He concluded that Denmark was “capable of working full time in a primarily sedentary position within the limitations and restrictions noted on the Functional Capacities Form.”

On December 10, 2002, Liberty informed Denmark’s attorney that Liberty was upholding its decision to deny LTD benefits. The denial letter states that the review found no impairment that would prevent Denmark from performing her own occupational job duties. Liberty’s case log states that “there was limited evidence to support a need to cease working and she was able in fact to work at her own occupation for long hours. There was no indication of what changed in her condition.”

3. Social Security Determination

On January 31, 2004, an Administrative Law Judge (“ALJ”) issued a decision that Denmark was entitled to Social Security disability benefits retroactive to October 2, 2001, the day before she stopped working. The ALJ concluded that Denmark had been “disabled” within the meaning of the Social Security Act since that date, because her “severe pain, limitations, and restrictions ... prevent [ ] her from performing her past relevant work” and because she has suffered a “substantial loss of ability necessary to perform a significant number of jobs” identified as unskilled sedentary occupations. The ALJ noted that Denmark’s testimony was credible and gave controlling weight to additional reports that Drs. Goodman and Hack submitted for the Social Security benefits determination.

Based on this favorable decision, Denmark again requested that Liberty review the denial of LTD benefits and submitted a new letter from Dr. Goodman. However, Liberty replied on June 3, 2004 that the Social Security disability benefits decision did not affect its prior denial determination.

B. Procedural History

On September 17, 2004, Denmark filed an action against Liberty in Superior Court in Massachusetts seeking LTD benefits allegedly due under ERISA, 29 U.S.C. §§ 1132(a)(1)(B), (2) and (3), and for breach of contract. On October 27, Liberty removed the case to federal district court in Massachusetts. Denmark did not pursue her breach of contract claim in federal court.

After discovery was completed, both parties moved for summary judgment. In a detailed opinion, the district court granted summary judgment for Liberty. The court found that, because the ERISA plan granted discretionary authority to Liberty, the benefits decision should be reviewed under the arbitrary and capricious standard. Although the court acknowledged the potential financial conflict of interest presented by Liberty’s dual responsibility for making benefits determinations and paying claims, it found no evidence that this financial arrangement had risen to the level of an actual conflict that would warrant altering the standard of review. The court then thoroughly examined the evidence supporting Liberty’s denial of LTD benefits to Denmark, ultimately concluding that this denial, supported by substantial evidence, was neither arbitrary nor capricious. In light of the record and precedents, this decision is affirmed.

II.

A. Summary Judgment Standard in an ERISA Action

Summary judgment is appropriate when “the pleadings, depositions, answers to interrogatories, and admissions on file, together with the affidavits, if any, show that there is no genuine issue as to any material fact and that the moving party is entitled to judgment as a matter of law.” Fed.R.Civ.P. 56(c). In the ERISA context, however, we have held that “the district court sits more as an appellate tribunal than as a trial court”; instead of considering affidavits submitted to the court, it “evaluates the reasonableness of an administrative determination in light of the record compiled before the plan fiduciary.” Leahy v. Raytheon Co., 315 F.3d 11, 18 (1st Cir.2002). Thus, because the court’s review is usually based only on the administrative record, “summary judgment is simply a vehicle for deciding the issue,” and consequently “the non-moving party is not entitled to the usual inferences in its favor.” Orndorf v. Paul Revere Life Ins. Co., 404 F.3d 510, 517 (1st Cir.2005).

B. De Novo or Deferential Review

Appellate review of the district court’s grant of summary judgment is de novo. See, e.g., Wright v. R.R. Donnelley & Sons Co. Group Benefits Plan, 402 F.3d 67, 73-74 (1st Cir.2005). However, the issue of what standard of review this court applies to the administrative record can raise more difficult questions. Denmark makes three arguments in support of de novo review. First, she claims that de novo review is warranted because Liberty was not granted discretionary authority under the terms of the LTD plan. Second, she requests that we reconsider our precedent and hold that, because Liberty both reviews and pays claims, there is a structural conflict of interest requiring de novo review. Finally, she argues that the entire review process was “infected with conflict.” This opinion will address each of these arguments in turn.

1. Grant of Discretionary Authority in an ERISA Instrument

In Firestone Tire & Rubber Co. v. Bruch, 489 U.S. 101, 115, 109 S.Ct. 948, 103 L.Ed.2d 80 (1989) the Supreme Court held that “a denial of benefits challenged under [29 U.S.C.] § 1132(a)(1)(B) is to be reviewed under a de novo standard unless the benefit plan gives the administrator or fiduciary discretionary authority to determine eligibility for benefits or to construe the terms of the plan.” If the ERISA plan grants the plan administrator such discretionary authority in the determination of eligibility for benefits, the administrator’s decision must be upheld unless it is arbitrary, capricious, or an abuse of discretion. Wright, 402 F.3d at 74.

The only document in the record describing the LTD plan is the insurance policy. Section 7 of the policy provides that “Liberty shall possess the authority, in its sole discretion, to construe the terms of this policy and to determine benefit eligibility hereunder. Liberty’s decisions regarding construction of the terms of this policy and benefit eligibility shall be conclusive and binding.” In light of this language, the district court concluded that “the record unequivocally shows that Liberty is a fiduciary within the meaning of ERISA because it acted within the capacity of manager and administrator of GenRad’s LTD benefits plan,” and consequently held that the decision denying Denmark LTD benefits was subject to arbitrary and capricious review.

On appeal, Denmark argues that Liberty “granted discretion to itself without authority to do so.” While acknowledging that the language in Section 7 could provide a sufficient grant of discretionary authority, she protests that in this case authority was not properly delegated to Liberty in an ERISA plan instrument. Denmark first contends that the LTD policy in the record was not an ERISA plan instrument, but rather an insurance policy that Liberty itself wrote. She then asserts that GenRad never properly delegated discretionary authority to Liberty.

The improper delegation argument involves the interaction of several provisions of ERISA. As an employer who established the LTD plan, GenRad — later Teradyne — is the plan’s sponsor. See 29 U.S.C. § 1002(16)(B). Because the LTD policy does not designate an administrator, GenRad is also the administrator. See id. § 1002(16)(A). In light of GenRad’s statutorily defined role as plan administrator, Denmark argues that the LTD policy did not contain a sufficient delegation of fiduciary authority from GenRad to Liberty. ERISA states that “[t]he instrument under which a plan is maintained may expressly provide for procedures (A) for allocating fiduciary responsibilities ... among named fiduciaries, and (B) for named fiduciaries to designate persons ... to carry out fiduciary responsibilities.” Id. § 1105(c)(1). Denmark thus argues that the LTD policy needed to designate Liberty to carry out fiduciary responsibilities with an explicit statement such as “GenRad vests Liberty with fiduciary responsibility including making benefit decisions.”

Liberty counters that an insurance policy can serve as an ERISA plan instrument. Moreover, it argues that the lack of explicit delegation is irrelevant because, for all practical purposes, it performed the role of a fiduciary of the policy.

Under 29 U.S.C. § 1102(a)(1), “[e]very employee benefit plan shall be established and maintained pursuant to a written instrument.” Courts have held that an insurance policy may serve as this written instrument. See, e.g., Krohn v. Huron Mem’l Hosp., 173 F.3d 542, 551 (6th Cir.1999) (“[T]he long-term disability policy ... is the written instrument pursuant to which this employee benefit plan was established and maintained, as required by ERISA.”); Cinelli v. Sec. Pac. Corp., 61 F.3d 1437, 1441 (9th Cir.1995) (“[I]t is clear that an insurance policy may constitute the ‘written instrument’ of an ERISA plan.”); Gable v. Sweetheart Cup Co., 35 F.3d 851, 856 (4th Cir.1994) (“An insurance policy may constitute the ‘written instrument’ of an ERISA plan....”).

However, a policy which serves as the written instrument must

(1) provide a procedure for establishing and carrying out a funding policy and method consistent with the objectives of the plan and the requirements of this subchapter,
(2) describe any procedure under the plan for the allocation of responsibilities for the operation and administration of the plan (including any procedure described in section 1105(c) of this title),
(3) provide a procedure for amending such plan, and for identifying the persons who have authority to amend the plan, and
(4) specify the basis on which payments are made to and from the plan.

29 U.S.C. § 1102(b). Our examination of the LTD policy satisfies us that it complies with these requirements. Section 8 explains that Liberty sets the premiums, specifies the conditions under which the premiums can be changed and provides for payment of premiums. Section 4 also provides a detailed description of benefits. These provisions satisfy requirements (1) and (4). As previously quoted, Section 7 explains that “Liberty shall possess the authority, in its sole discretion, to construe the terms of the policy and to determine benefit eligibility hereunder.” This provision satisfies requirement (2). Section 7 also states: “This Policy may be changed in whole or in part. Only an officer of Liberty can approve a change. The approval must be in writing and endorsed on or attached to this policy.” This provision satisfies requirement (3). Thus, the LTD policy is the “written instrument” required under 29 U.S.C. § 1102(a)(1).

Denmark’s contention that GenRad did not explicitly delegate fiduciary responsibility to Liberty also fails. Although Liberty drafted the LTD policy, GenRad adopted the provisions of the policy — including the grant of discretionary authority to Liberty — by purchasing the LTD plan from Liberty. Moreover, the provisions of the LTD policy described supra unquestionably assign Liberty fiduciary responsibility for reviewing benefit claims. At least one court has held that identical language warrants arbitrary and capricious review “because Liberty is a fiduciary that the plan vests with discretionary authority.” Ellis v. Liberty Life Assurance Co., 394 F.3d 262, 266 & n. 3 (5th Cir.2004). Thus, the policy properly grants discretionary authority to Liberty and, consequently, the LTD benefit decision warrants arbitrary and capricious review under Firestone.

2. Effect of Structural Conflict of Interest

Firestone also held that “if a benefit plan gives discretion to an administrator or fiduciary who is operating under a conflict of interest, that conflict must be weighed as a ‘facto[r] in determining whether there is an abuse of discretion.’ ” 489 U.S. at 115, 109 S.Ct. 948 (quoting Restatement (Second) of Trusts § 187, cmt. d (1959)). Since Firestone, however, “courts have struggled to give effect to this delphic statement, and to determine both what constitutes a conflict of interest and how a conflict should affect the scrutiny of an administrator’s decision to deny benefits.” Pinto v. Reliance Standard Life Ins. Co., 214 F.3d 377, 383 (3d Cir.2000).

In Doyle v. Paul Revere Life Insurance Co., 144 F.3d 181, 184 (1st Cir.1998), we acknowledged that “a conflict of interest exists” where, as here, “any award of benefits would come out of [the insurer’s] own pocket.” However, we then observed that an insurer’s financial self-interest is balanced by an “important competing motive: having a benefit plan is to please employees, not to result in the employer’s bad reputation. Indeed, we venture that an employer would not want to keep an overly tight-fisted insurer. The conflict is not as serious as might appear at first blush.” Id. (citation omitted). We indicated that we would review cases of conflict with “more bite,” meaning that we would “adher[e] to the arbitrary and capricious principle, with special emphasis on reasonableness, but with the burden on the claimant to show that the decision was improperly motivated.” Id. In that regard, we concluded that merely showing that the insurer both reviewed and paid claims was not enough to warrant “more bite.” Id.

Subsequently, in Doe v. Travelers Insurance Co., 167 F.3d 53, 57 (1st Cir.1999), we limited the significance that might be imputed to the phrase “more bite,” explaining that “[t]he essential requirement of reasonableness has substantial bite itself where, as here, we are concerned with a specific treatment decision based on medical criteria and not some broad issue of public policy.” We reiterated that an insurer’s “general interest in conserving its resources is [not] the kind of conflict that warrants de novo review.” Id.

Our subsequent cases have read Doyle and Doe to stand for the proposition that “[t]he fact that ... [the plan administrator] will have to pay [the] claim out of its own assets does not change [the arbitrary and capricious] standard of review.” Glista v. Unum Life Ins. Co. of Am., 378 F.3d 113, 125-26 (1st Cir.2004) (citing Pari-Fasano v. ITT Hartford Life & Accident Ins. Co., 230 F.3d 415, 418-19 (1st Cir.2000)). We have continued to justify our adherence to arbitrary and capricious review in the face of such “structural” conflicts on the ground that “the market presents competing incentives to the insurer that substantially minimize the apparent conflict.” Pari-Fasano, 230 F.3d at 418. Recently, however, we acknowledged that “other circuits have rejected the market forces rationale and specifically recognized a conflict of interest when the insurer of an ERISA plan also serves as plan administrator, although there is no consistent approach in accordingly adjusting the standard of review.” Wright, 402 F.3d at 75 & n.5.

Prior to argument in this case, Denmark requested in a petition for initial hearing en banc that we reconsider this precedent. We denied the petition, and she now renews her argument that, where an insurer both reviews and pays claims, there is a “structural irreconcilable conflict of interest that requires an initial de novo review of an adverse benefit decision.” Liberty responds that the financial self-interest of an insurer who both reviews and pays claims is counterbalanced by the insurer’s desire to remain attractive to employers, and emphasizes the desirability of a simple procedure for reviewing denials of benefits.

The circuits have adopted varying approaches to the issue of whether the structural conflict that arises when an insurer both reviews and pays claims justifies less deferential review. In addition to this court, the Seventh and Second Circuits have held that a structural conflict alone is insufficient to alter the standard of review. Instead, these circuits require an actual showing that the conflict of interest affected the benefits decision before there will be any alteration in the standard of review. See Rud v. Liberty Life Assurance Co., 438 F.3d 772, 776-77 (7th Cir.2006)(holding that a structural conflict of interest, without more, does not affect the standard of review); Sullivan v. LTV Aerospace & Def. Co., 82 F.3d 1251, 1255-56 (2d Cir.1996)(holding that a claimant must show that a conflict of interest affected the benefits decision, but if such showing is made, de novo review applies).

However, seven other circuits have held that a structural conflict warrants alteration to the standard of review, although six of these circuits apply less deferential review within the arbitrary and capricious framework. Of these six circuits, all except one have adopted a “sliding scale” approach to the standard of review, in which the court applies less deferential review to the extent that a conflict of interest exists. See, e.g., Fought v. Unum Life Ins. Co. of Am., 379 F.3d 997, 1004 (10th Cir.2004)(per curiam)(explaining that “the court must decrease the level of deference given to the conflicted administrator’s decision in proportion to the seriousness of the conflict” (internal citation and quotation omitted)); Pinto, 214 F.3d at 379 (expressly adopting a “sliding scale method, intensifying the degree of scrutiny to match the degree of the conflict”); Vega v. Nat’l Life Ins. Servs., Inc., 188 F.3d 287, 297 (5th Cir.1999)(en banc)(explaining that “[t]he greater the evidence of conflict on the part of the administrator, the less deferential our abuse of discretion standard will be”); Woo v. Deluxe Corp., 144 F.3d 1157, 1161-62 & n. 2 (8th Cir.1998)(explicitly adopting the sliding scale approach while noting that “not every funding conflict of interest per se warrants heightened review”); Doe v. Group Hosp. & Med. Servs., 3 F.3d 80, 87 (4th Cir.1993)(applying less deference “to the degree necessary to neutralize any untoward influence resulting from the conflict”). The Ninth Circuit employs a “substantially similar” approach, but with a “conscious rejection of the ‘sliding scale’ metaphor” on the ground that “[a] straightforward abuse of discretion analysis allows a court to tailor its review to all the circumstances before it.” Abatie v. Alta Health & Life Ins. Co., 458 F.3d 955, 967-68 (9th Cir.2006)(en banc).

The Eleventh Circuit uses a different framework. It first determines, under de novo review, whether the decision was wrong; if it was, and if an inherent conflict of interest exists, “the burden shifts to the claims administrator to prove that its interpretation of the plan is not tainted by self-interest.” HCA Health Servs., Inc. v. Employers Health Ins. Co., 240 F.3d 982, 993-94 (11th Cir.2001). The claims administrator may then meet this burden “by showing that its wrong but reasonable interpretation of the plan benefits the class of participants and beneficiaries.” Id. at 994-95.

Finally, the D.C. Circuit has not yet established a standard of review in cases involving a structural conflict of interest. See Wagener v. SBC Pension Benefit Plan-Non Bargained Program, 407 F.3d 395, 402 (D.C.Cir.2005)(finding that the result would be the same under either arbitrary and capricious or de novo review).

This survey of the circuits reflects the difficulty of the standard of review issue. It remains an issue of considerable importance, as the number of amicus curiae briefs filed in this case demonstrates. However, this court’s prior decision in Doyle cannot be overruled by a panel of the court. Thus, we are bound by our precedent, and must apply arbitrary and capricious review in situations where the conflict of interest is purely structural.

Nevertheless, I think our standard of review in cases in which an insurer also makes benefits determinations is increasingly difficult to defend. In light of the development of the law since we first addressed the issue in Doyle, and because our position is now a minority view, I think it is time to reexamine the standard of review issue in an en banc proceeding. Although Judge Howard dissents from the judgment agreed to by Judge Selya and myself, he agrees with me, as indicated in his dissent, that we should reexamine the standard of review issue. In such a proceeding, I would be inclined to favor the “sliding scale” approach explicitly adopted by the Third, Fourth, Fifth, Eighth, and Tenth Circuits. Under this approach, “ ‘the fiduciary decision will be entitled to some deference, but this deference will be lessened to the degree necessary to neutralize any untoward influence resulting from the conflict.’ ” Pinto, 214 F.3d at 391 (citation omitted).

3. Other Evidence of Conflict

This panel must also consider, however, whether other evidence, in addition to this structural conflict, warrants alteration of the standard of review. In her broadest argument, Denmark asserts that “[t]he District Court should have reviewed the entire Liberty analysis with no less than a heightened standard of review, because the entire claims process ... was infected with conflict.” In support of this argument, she contends that Liberty chose to rely on unreliable medical opinions and improperly imposed a requirement that she produce objective evidence of her inability to work, thereby demonstrating its bias against her.

This argument is essentially another request that we reject our precedents in Doyle and Doe. The evidence to which Denmark refers is part of the record on which Liberty based its benefits decision, and is thus properly evaluated under the “essential requirement of reasonableness” that we apply to that decision, Doe, 167 F.3d at 57, rather than as evidence of a conflict of interest. As the district court acknowledged, Wright entertained — and rejected — a plaintiff’s claim that there was a conflict of interest based on certain factual circumstances surrounding the claim. 402 F.3d at 76. Here, however, the evidence in question falls within the scope of the reasonableness analysis, and this opinion will consider it using that framework.

Before the district court, Denmark made one additional attempt to demonstrate a conflict of interest. She filed a motion for discovery of information relating to the financial relationship between Liberty and NMR, arguing that such information would help her establish that Liberty referred claims to NMR for reasons of financial self-interest. The district court granted the motion and ordered Liberty to produce information about how much money it paid NMR, answer Denmark’s interrogatory as to how many files it referred to NMR, and stipulate the number of cases in which NMR has accepted a claim. Liberty produced information showing that between 2001 and 2003 it had paid $2,004,656 and referred 1,204 files to NMR, but stated that it was unable to make the stipulation described by the district court “due to the very substantial burden and expense that would be involved in retrieving and manually reviewing the over 1,200 claims files that were referred to NMR physicians from 2001 to 2003.”

In its order granting summary judgment to Liberty, the district court interpreted “the First Circuit’s reluctant concurrence with past precedent in Wright as a suggestion that additional evidence of the ERISA plan administrator’s efforts to maximize profits could be enough to turn apparent conflict into real conflict.” Moreover, it stated that the discovery Denmark requested was appropriate because “[a] claimant cannot meet the burden of demonstrating a conflict of interest if she cannot supplement the record with relevant evidence.” To sanction Liberty for its refusal to comply with this discovery order, the district court determined that it would “draw the inference suggested by the Plaintiff, namely that NMR has not found in favor of a single claimant in connection with the 1,204 Liberty files referred to NMR during the years 2001-2003.” The court concluded that, if NMR never found in favor of claimants in its evaluations, Liberty would have a “ ‘clear incentive’ to contract with NMR to obtain peer reviews that support denial of disability benefits, justifying a review with ‘more bite.’ ” However, the court also determined that to “calibrate the sanction” it would apply the heightened standard of review only to Liberty’s reliance on Dr. Bomalaski’s opinion because no other evidence in the record would have been affected by the conflict with NMR.

On appeal, Denmark does not advance the argument that, under this court’s existing precedent, the evidence of Liberty’s relationship with NMR would, in combination with the existing structural conflict, justify some degree of heightened review not only of Dr. Bomalaski’s opinion but also of the entire record. Liberty, however, protests both the discovery that the district court allowed and the sanction that the court imposed for its refusal to comply. It urges that we disregard the discovery admitted into the record and reject the heightened standard of review that the district court applied to Dr. Bomalaski’s opinion.

In light of Denmark’s silence on the link between the particularized conflict evidence and an altered standard of review for the entire record, we need not decide whether such alteration of the overall standard of review would have been appropriate under our precedents suggesting that proof of an actual conflict (rather than a structural conflict) might justify review with “more bite.” However, the panel has divided on its response to the sanction imposed by the district court for Liberty’s refusal to comply with the discovery order. Judge Selya concludes that the sanction was a proper exercise of the district court’s authority. Judge Howard does not reach the sanction issue because his view of the case does not depend on whether a higher level of scrutiny is applied to the report of Dr. Bomalaski. In my view, the district court erred in applying heightened review to Dr. Bomalaski’s opinion as a sanction for failure to comply with the discovery it ordered. This piecemeal approach to the standard of review is not supported by any precedents that I could find.

I think that the district court’s implementation of its piecemeal approach demonstrates the problem with such an approach. It first applied heightened review to conclude that Dr. Bomalaski’s opinion was “not fully supported, although still relevant.” It then factored its skeptical assessment of Dr. Bomalaski’s opinion into the overall arbitrary and capricious analysis, essentially running Dr. Bomalaski’s opinion through two levels of analysis. Although arguably manageable as a mode of analysis here because this case involves only one expert’s opinion, one can easily imagine cases in which this piecemeal approach would be unmanageable if conflicts affected numerous, discrete portions of the record. I cannot approve such an unwieldy approach to the standard of review. Therefore, in applying the arbitrary and capricious standard of review as articulated in Doyle and Doe to the administrative record, I will not apply heightened review to the report submitted by Dr. Bomalaski.

Having found no reason to depart from the arbitrary and capricious standard, I turn to the administrative record, applying that standard as articulated in Doyle and Doe.

III.

Under arbitrary and capricious review, this court will uphold an administrator’s decision to deny benefits to a beneficiary if the decision was reasoned and supported by substantial evidence. Wright, 402 F.3d at 74. Evidence is substantial when it is reasonably sufficient to support a conclusion, and contrary evidence does not make the decision unreasonable. Id. While arbitrary and capricious review is not the equivalent of a “ ‘rubber stamp,’ ” Lopes v. Metro. Life Ins. Co., 332 F.3d 1, 5 (1st Cir.2003) (citation omitted), “ ‘a court is not to substitute its judgment for that of the [decisionmaker],’ ” Terry v. Bayer Corp., 145 F.3d 28, 40 (1st Cir.1998) (citation omitted).

Thus, I will review the LTD benefits denial by examining in turn Liberty’s review of the medical opinions, its insistence on objective evidence, its reliance on the surveillance evidence, and its refusal to consider the favorable Social Security determination. After such review, I conclude that Liberty’s denial of benefits was supported by substantial evidence.

A. Review of Medical Evidence

In support of her claim that Liberty’s decision was not supported by substantial evidence, Denmark first argues, as a general proposition, that Liberty improperly based its decision on the opinions of medical personnel who did not examine her. She then discusses the reports of these medical personnel on an individual basis, arguing that these opinions were unreliable. Liberty counters that the opinions were consistent with the claims file and that, consequently, it could rely on them.

1. Nonexamining Medical Personnel

Denmark objects to Liberty’s reliance on the opinions of Nurse Kaye and Dr. Miller over the opinions of three doctors who conducted physical examinations: Dr. Malanoski, Dr. Goodman, and Dr. Schur. She relies on Calvert v. Firstar Finance, Inc., 409 F.3d 286, 295 (6th Cir.2005), in which the court found that while “reliance on a file review does not, standing alone, require the conclusion that [the insurer] acted improperly, we find that the failure to conduct a physical examination ... may, in some cases, raise questions about the thoroughness and accuracy of the benefits determination.”

However, the Supreme Court held in Black & Decker Disability Plan v. Nord, 538 U.S. 822, 834, 123 S.Ct. 1965, 155 L.Ed.2d 1034 (2003), that courts “have no warrant to require plan administrators automatically to accord special weight to the opinions of a claimant’s physician; nor may courts impose on such administrators a discrete burden of explanation when such administrators credit reliable evidence that conflicts with a treating physician’s evaluation.” Moreover, as Calvert acknowledges, the absence of a physical exam is not determinative. 409 F.3d at 295. Finally, in Gannon v. Metropolitan Life Insurance Co., 360 F.3d 211, 214 (1st Cir.2004), this court noted that “we have treated a nonexamining physician’s review of a claimant’s file as reliable medical evidence on several occasions.” Thus, I conclude that it was permissible for Liberty to rely on the reports of nonexamining physicians over those of examining physicians.

2. Nurse Kaye’s Opinion

In addition to her objection to Nurse Kaye’s nonexamining role, Denmark argues that Liberty should not have relied on the opinion of a nurse over the opinions of doctors. Moreover, she contends that Kaye made an improper credibility determination.

Denmark directs this panel to Gellerman v. Jefferson Pilot Financial Co., 376 F.Supp.2d 724, 735 (S.D.Tex.2005), in which one district court found that “the level of deference due nurses should generally be less than that extended to doctors whose professions in the relevant field.” Although that district court admitted that little precedent supports this exact proposition, “other courts, at least in the context of determining medical necessity, have been wary of giving nurses broad deference.” Id. at 735-36 (citing C.N.S., Inc. v. Conn. Gen. Life Ins. Co., 9 F.Supp.2d 194, 198 (E.D.N.Y.1998); Pritt v. United Mine Workers of Am., 847 F.Supp. 427 (S.D.W.Va.1994)).

I conclude, contrary to Denmark’s claims, that it was not improper for Liberty to rely on Nurse Kaye’s report questioning the probative value of Dr. Schur’s opinion. On April 12, 2002, as part of the STD appeal, Dr. Schur evaluated Denmark and found that “at least for the time being, she is clearly disabled not only from work, but from being able to take care of her household.” Denmark’s application for LTD benefits in June 2002 then began a new phase of the process, and Denmark failed to submit any additional medical evidence during this part of the process. Evaluating Denmark’s LTD claim on August 8, Nurse Kaye found that “inferences into the status of [the Plaintiff’s] conditions 6 months previous cannot be accurately assessed.” Although in fact only four months had elapsed, Dr. Schur’s own limitation of his opinion to “the time being” and Denmark’s failure to submit additional medical reports resulted in a lack of evidence showing that, on an ongoing basis, Denmark was “unable to perform the Material and Substantial Duties” of her own occupation. Consequently, I find that Nurse Kaye’s assessment was not unreasonable.

Denmark also contends that Liberty should not have relied on Kaye’s opinion due to an apparent credibility determination that Kaye made in her review of Denmark’s file. Denmark cites Smith v. Continental Casualty Co., 450 F.3d 253, 263-64 (6th Cir.2006), in which the court found that an insurer’s reliance on a credibility determination by a nonexamining physician, in conjunction with its refusal to seek an independent medical examination, supported a finding that the decision was arbitrary.

In the claim log, Kaye observed: “[Denmark] may be self-limiting her work or social activities.... Barriers to recovery are clmnt’s perception that she is unable to work & is seeking permanent benefits.” However, this apparent credibility determination was only one of many statements by Kaye evaluating Denmark’s record. There is no indication that this statement dictated the outcome of the benefits decision. Kaye also thoroughly reviewed each piece of evidence in Denmark’s file and requested a peer review “to fully define” Denmark’s condition and to “determine if the [restrictions/limitations] are supported.” This review evinces a full evaluation of the evidence in the record.

3. Dr. Miller’s Opinion

Denmark challenges Liberty’s reliance on the opinion of Dr. Miller on the grounds that he did not examine her, that he required objective evidence of her condition, and that he did not consider her entire medical record. For the reasons discussed supra, Dr. Miller’s evaluation is not unreliable simply because he did not examine Denmark. And, as explained infra, the emphasis on objective evidence of inability to work does not render Dr. Miller’s opinion unreliable. Finally, Denmark’s only evidence that Dr. Miller did not examine her entire medical file was his fax report accompanied by only three pages of medical records. However, this fact alone does not demonstrate that he did not also consider the remainder of her file; in fact, his report refers to several documents in Denmark’s file that were not among those that accompanied the fax.

4. Dr. Bomalaski’s Opinion

Denmark contends that Dr. Bomalaski’s opinion was unreliable because he did not identify the records that he reviewed and because his conclusion that she could work full time in her own occupation did not follow from surveillance evidence documenting her activities for a few hours at a time. Liberty counters that Dr. Bomalaski’s opinion was based on both the surveillance evidence and the medical records in the file.

Dr. Bomalaski’s opinion specifically refers to the opinions of Drs. Malanoski, Goodman, Taylor and Schur, the Functional Capacities Form, and the surveillance evidence, indicating that he examined these records. He first found that “[t]he clinical medical evidence does not clearly support severe impairment because as noted the diagnosis of fibromyalgia remains in question not only by this reviewer but also by Dr. Schur.” Given that Liberty does not contest the diagnosis of fibromyalgia, this statement is largely irrelevant.

More importantly, Dr. Bomalaski relied on the surveillance evidence to conclude that Denmark could engage in activities such as sitting, standing, walking, driving, reading, and lifting up to twenty pounds “occasionally,” meaning “up to 1/3 of the time,” and observed that she is “able to lift a gallon of water (10 pounds) with one hand and place in car from grocery cart.” Dr. Bomalaski concluded that “Denmark is capable of working full time in a primarily sedentary position within the limitations and restrictions noted on the Functional Capacities Form.” His characterization of her occupation as “primarily sedentary” is supported by the Labor Market Survey’s description of the physical demands as “sedentary to light.” Moreover, while his conclusion that she can perform the duties of her occupation does not follow necessarily from the surveillance evidence, it is not contradicted by objective evidence of Denmark’s functional limitations.

B. Requirement of Objective Evidence of Inability to Work

Denmark also contends that it was unreasonable for Liberty to require her to produce objective evidence of her inability to work. In evaluating the STD claim, Kaye sought a peer review from Dr. Miller asking whether “the accompanying documentation provide[s] objective findings that would indicate a significant change in condition that was evident on or about the date of the disability (10/3/2001)[.]” Dr. Miller responded that “[t]here are no documented objective physical exam findings that support a decrease or significant change in this patient’s physical condition.” His conclusion is quoted and paraphrased throughout Liberty’s claim log as well as in the letters denying Denmark’s STD and LTD benefits claims.

We have previously found it unreasonable for an insurer to require objective evidence to support a diagnosis of a condition that is not subject to verification through laboratory testing. See Cook v. Liberty Life Assurance Co., 320 F.3d 11, 21 (1st Cir.2003)(explaining that it was unreasonable for an insurer to require a claimant to provide “clinical objective” evidence that she was suffering from chronic fatigue syndrome). The district court found that fibromyalgia is also a condition that is not subject to objective verification because, as Dr. Malanoski noted, it is a “condition lacking in blood testing or specific in physical exam.” In accordance with this precedent, Liberty did not base its denial of LTD benefits on reservations about Denmark’s diagnosis of fibromyalgia.

However, this court draws a distinction between requiring objective evidence of the diagnosis, which is impermissible for a condition such as fibromyalgia that does not lend itself to objective verification, and requiring objective evidence that the claimant is unable to work, which is allowed. In Boardman v. Prudential Insurance Co. of America, 337 F.3d 9, 17 n. 5 (1st Cir.2003), we held that “[w]hile the diagnoses of chronic fatigue syndrome and fibromyalgia may not lend themselves to objective findings, the physical limitations imposed by the symptoms of such illnesses do lend themselves to objective analysis.” We concluded:

None of the specialists that have treated Ms. Boardman in the past two years have indicated any limitations or restrictions, based on objective findings, that would preclude Ms. Boardman from performing any occupation for which she is suited. Therefore, we have determined that Ms. Boardman does not meet the definition of disability as required by the policy.

Id. at 16-17. Thus, it fell within the parameters defined in Boardman for Liberty to require Denmark to provide objective evidence of functional limitations or restrictions that would prevent her from working.

Denmark attempts to distinguish Boardman on the ground that, unlike the plaintiff in that case, she submitted evidence of impairment rather than mere conclusions from the treating physician. Moreover, she argues that, unlike the IME in Boardman, the IME in this case agreed with the conclusions of the treating physician. Dr. Schur stated:

Her muscle tone is very poor throughout; she has about 30 tender points all over, including her hands, wrists, elbows, around her shoulders, back and neck, upper back, middle and low back, as well as lower extremity.... [S]he has decreased abduction in both shoulders, more on the right than on the left, and decreased rotation in both shoulders, some decreased rotation in both hips. I cannot do straight leg raising on either side because of back discomfort.

Denmark argues that Dr. Schur’s findings provide the objective evidence that was lacking in Boardman. However, while Denmark is correct that Dr. Schur’s evaluation does provide some objective support for her physical restrictions, it does not demonstrate objectively her inability to work because the evaluation does not relate those restrictions either to the specific physical requirements of her job or to her overall stamina. Liberty’s emphasis on objective evidence was not impermissible.

C. Reliance on Surveillance Evidence

Denmark also argues that Liberty relied unreasonably on the surveillance it obtained. She dismisses the evidence against her, emphasizing that she was never out of her house for more than a few hours at a time and that on two of the four days she did not leave her house at all.

The surveillance directly contradicted the limitations Denmark reported in her activities questionnaire. She was observed driving for up to twenty minutes at a time, when her activities questionnaire had reported that she could drive for only five minutes at a time. She was also observed lifting and carrying groceries from the store to her car and from her car to her house, when her activities questionnaire had reported that she needed help with these tasks.

I conclude that Liberty was entitled to consider the surveillance evidence along with the other medical opinions and evidence that it had at its disposal. Liberty itself acknowledged the limitations of the evidence, explaining that the investigator “didn’t have a video camera to document [Denmark’s] activities only a photo camera which can’t detail how fluid her movements were and how long she was doing the activities.” However, Denmark does not contest that she performed the activities described by the surveillance investigator, and it was not improper for Liberty to use the investigator’s report and photographs in making the LTD benefit determination.

D. Refusal to Consider Social Security Determination

Finally, Denmark argues that Liberty should have reconsidered her claim in light of her favorable decision from the Social Security Administration (“SSA”). As the district court correctly noted, “[a]lthough the SSA’s determination of a claimant’s entitlement to social security disability benefits is not binding on disability insurers, it can be relevant to an insurer’s determination whether that claimant is eligible for disability benefits.” See Gannon, 360 F.3d at 215.

However, the district court also correctly concluded that the timing in this case forecloses Denmark’s argument. Liberty initially decided Denmark’s LTD claim on August 20, 2002. It then denied Denmark’s appeal in a letter dated December 10, 2002, which stated that Denmark’s “administrative right of review has been exhausted and no further review will be conducted by Liberty.” On January 13, 2004, more than thirteen months later, the SSA issued its decision for Denmark.

On appeal, Denmark relies heavily on Calvert, 409 F.3d 286, in which the Sixth Circuit held that an insurer’s refusal to consider a Social Security benefits decision contributed to a finding that its decision was arbitrary and capricious. In Calvert, however, the SSA’s decision was issued more than two years before Liberty denied the claimant’s appeal. We have observed that “in order to find that an insurer has abused its discretion under the contract, we would have to conclude that the insurer’s eligibility determination was unreasonable in light of the information available to it” when it made its decision. Pari-Fasano v. ITT Hartford Life & Accident Ins. Co., 230 F.3d 415, 419 (1st Cir.2000); see also Orndorf, 404 F.3d at 519 (“[T]he final administrative decision acts as a temporal cut off point. The claimant may not come to a court and ask it to consider post-denial medical evidence in an effort to reopen the administrative decision.”). Because the SSA decision was not available to Liberty when it made its decision, Liberty cannot be faulted for failing to factor the SSA decision into its final decision.

IV.

Applying arbitrary and capricious review, I conclude that Liberty’s benefits determination was supported by substantial evidence. Liberty reasonably relied on the opinions of three nonexamining medical personnel: Nurse Kaye, Dr. Miller, and Dr. Bomalaski. These reports, in conjunction with the lack of objective evidence to support Denmark’s inability to work, and the surveillance evidence, provide substantial evidence to support Liberty’s decision to deny Denmark’s claim for LTD benefits. Finally, I cannot fault Liberty for refusing to reconsider its decision to deny benefits in light of the subsequent SSA decision.

Although Judge Selya does not join my analysis of the administrative record, he agrees with my conclusion that substantial evidence in that record supports Liberty’s decision. Thus, we affirm the district court’s determination that Liberty’s decision to deny Denmark’s claim for long term disability benefits was neither arbitrary nor capricious.

Affirmed.

SELYA, Circuit Judge, concurring in the judgment.

I agree with Judge Lipez’s meticulous analysis of the relevant medical evidence and with the lead opinion’s conclusion that the judgment rendered by the district court — a judgment rendered after an equally thoughtful study — should be affirmed. For the reasons elucidated by Judge Lipez and by the court below, the benefits determination made by Liberty, qua plan fiduciary, was within the universe of plausible outcomes. Consequently, that determination was neither arbitrary nor capricious. See, e.g., Leahy v. Raytheon Co., 315 F.3d 11, 21 (1st Cir.2002) (upholding plan fiduciary’s benefits determination, which, “though not inevitable, was solidly grounded”).

Despite my admiration for the lead opinion’s synthesis of the medical evidence, I choose not to join it. I write separately to identify the two areas of concern that prompt this course of action.

First — and in the overall scheme of things less important — I disagree with the lead opinion’s treatment of the sanction imposed by the district court. While I must confess that the sanction — reading the report of Dr. John Bomalaski with an increased level of scrutiny — is an unorthodox one, I do not think that the sanction goes beyond the limits of the district court’s authority.

It is black-letter law that district courts have wide authority to fashion sanctions that are appropriate to the circumstances. See, e.g., Santiago-Diaz v. Laboratorio Clinico y De Referencia Del Este, 456 F.3d 272, 275 (1st Cir.2006); Rosario-Diaz v. Gonzalez, 140 F.3d 312, 315 (1st Cir.1998). The more unusual the circumstances of the misconduct, the more a need for inventiveness may arise. See generally Tower Ventures, Inc. v. City of Westfield, 296 F.3d 43, 46 (1st Cir.2002) (encouraging case-by-case review of choice of sanction because “the circumstances attendant to noncompliance are apt to differ widely”). Busy district courts must be able to control their dockets and to manage burgeoning caseloads effectively. Thus, when the choice of an appropriate sanction is at issue, orthodoxy for its own sake ought not to be required.

The sanction imposed here, though curious, was custom-tailored to fit a unique set of circumstances and to offset the perceived effects of an idiosyncratic discovery violation. Although I would not recommend this form of sanction for everyday use, I am not prepared to say that it falls outside the extensive armamentarium of sanctions available to a federal district court. I would, therefore, unlike Judge Lipez, uphold the sanction.

Having made this point, I eschew any further elaboration. The situation is sui generis, and Judge Lipez’s refusal to enforce the sanction — while mistaken in my view — is of no consequence in the long run. Either way — that is, with or without effectuating the sanction — the district court’s affirmance of the benefits-denial determination merits our approbation.

This brings me to a more important, more global area of concern. I wish to disassociate myself from the lead opinion’s survey of the standard-of-review cases (which I find more nuanced than that opinion indicates) and from its conclusion that our own standard of review, most clearly set forth in Doe v. Travelers Insurance Co., 167 F.3d 53, 56-57 (1st Cir.1999), requires reexamination en banc. The Doe standard, in my view, is not only correct but also sufficiently flexible to permit us to take into account singular factors and circumstances that might heighten conflict-of-interest concerns in a particular case. See, e.g., id. at 57 (terming the requirement that the fiduciary’s determination be “reasonable” as “the basic touchstone” of our standard of review). Indeed, we have recognized that, in developing this standard of review in Doe and its lineal ancestor, Doyle v. Paul Revere Life Insurance Co., 144 F.3d 181, 184 (1st Cir.1998), “we took into account the potential for conflict in considering whether the insurer’s decision had strayed outside the bounds of reasonableness to become an abuse of discretion.” Pari-Fasano v. ITT Hartford Life & Accid. Ins. Co., 230 F.3d 415, 419 (1st Cir.2000). I have no desire to revisit this well-ploughed terrain.

I need not wax longiloquent. It suffices to say that the Doe standard has served us well. I continue to believe, as I remarked in a different context, that “if it ain’t broke, don’t fix it.” United States v. Natanel, 938 F.2d 302, 310 (1st Cir.1991) (quoting Lawrence “Yogi” Berra). In my judgment, our court would do well to heed that piece of folk wisdom here.

If more is needed — and I doubt that it is — en banc proceedings tend to be notoriously wasteful of scarce judicial resources. There seems to me to be little point in trading a workable and battle-tested standard of review for yet another plunge into the vortex of en banc consideration — a plunge that threatens to splinter the court and to make the standard of review less transparent. Even if one assumes, for argument’s sake, that our standard of review could profit from an attempted clarification with respect to “structural conflict” cases, the unevenness in the decisions of the various courts of appeals strongly suggests that any such undertaking should be left to the Supreme Court (when and if the Justices deem the time propitious).

With these brief comments, I concur in the judgment of the court.

HOWARD, Circuit Judge, dissenting.

For essentially the reasons stated by Judge Lipez, I agree that we ought to reexamine our approach to reviewing an administrator’s denial of benefits under a plan governed by the Employee Retirement Income Security Act, 29 U.S.C. § 1001 et seq. (“ERISA”), when those benefits stand to be paid out of the administrator’s own assets. I think, however, that, even under the standard of review now in place, we should reverse the decision by Liberty Life Assurance Company (“Liberty”) to deny long-term disability benefits to Diane Denmark. In my view, the denial “was ‘unreasonable in light of the information available’ to Liberty at the time of its decision.” Cook v. Liberty Life Assurance Co., 320 F.3d 11, 18 (1st Cir.2003) (quoting Pari-Fasano v. ITT Hartford Life & Acc. Ins. Co., 230 F.3d 415, 419 (1st Cir.2000)). As my view does not depend on whether a higher level of scrutiny is applied to the report of the outside physician who reviewed Denmark’s file for Liberty, Dr. John Bomalaski, see infra note 4, I do not reach whether the district court permissibly applied a higher level of scrutiny as a sanction for Liberty’s refusal to provide discovery on whether Dr. Bomalaski’s employer routinely approved Liberty’s benefit decisions.

As Judge Lipez’s lead opinion observes, Liberty does not dispute that Denmark has fibromyalgia, but only that the condition rendered her disabled under the terms of the plan when she stopped working in October 2001. Around that time, Denmark’s treating physician rated her as having a “severe limitation of functional capacity” and being “incapable of physical activity,” while her rheumatologist called her “disabled by exhaustion and myalgia which makes it difficult for her to stay at work for any appreciable amount of time.” At her employer’s insistence, Denmark then obtained a third opinion, in the form of an independent medical examination conducted by Dr. Peter Schur (“the IME”). Schur found that Denmark was “clearly disabled, not only from work, but from being able to take care of her household.” While he expressed hope that changes to Denmark’s drug regime and increased exercise “will improve matters so that she can get her stamina back and get back to work,” he concluded that “until that is accomplished, which may take months, she is clearly disabled.”

In spite of this consensus among the doctors who had examined Denmark, Liberty concluded that she was not, in fact, disabled because “there was no significant change in [her] condition” on the date she stopped working. This rationale, however, bears no relationship to any requirement for long-term disability benefits imposed by the plan itself. Instead, it amounts to an argument that, because Denmark has suffered from fibromyalgia since at least 1996 but continued working nonetheless, she cannot rely on fibromyalgia as a reason for not working in 2001 unless she can show that her condition has worsened. The argument does not survive even minimal scrutiny.

As a number of other courts have recognized, there is no “logical incompatibility between working full time and being disabled from working full time. A desperate person might force himself to work despite an illness that everyone agreed was totally disabling. Yet even a desperate person might not be able to maintain the necessary level of effort indefinitely.” Hawkins v. First Union Long-Term Disability Plan, 326 F.3d 914, 918 (7th Cir.2003) (citations omitted); accord Seitz v. Metro. Life Ins. Co., 433 F.3d 647, 651 (8th Cir.2006); Lasser v. Reliance Std. Life Ins. Co., 344 F.3d 381, 392 (3d Cir.2003); Marecek v. BellSouth Telecomms., Inc., 49 F.3d 702, 706 (11th Cir.1995). In addition to its appeal as a matter of human experience, this reasoning also furthers an important policy objective, namely that “[a] disabled person should not be punished for heroic efforts to work by being held to have forfeited his entitlement to disability benefits should he stop working.” Hawkins, 326 F.3d at 918.

Rather than engaging this line of authority here, the lead opinion sidesteps the issue, concluding that, because Denmark herself asserted that her condition had worsened at the time she stopped working, Liberty was entitled to rely on her failure to support that assertion in denying her claim. Under the plan, however, Denmark’s entitlement to long-term disability payments does not depend on whether her fibromyalgia had intensified around the time she stopped working. It depends on whether she was then “unable to perform the Material and Substantial Duties of [her] Own Occupation.” Denmark’s treating physicians, as well as the independent doctor hired by her employer, all concluded that she was. Liberty could not have reasonably reached the opposite conclusion based on Denmark’s failure to prove a fact that is inessential, and logically unconnected, to her contractual right to long-term disability benefits.

The lead opinion also downplays the importance of this factor to Liberty’s ultimate decision, focusing instead on the asserted absence of objective evidence of Denmark’s inability to work. But the lack of a change in Denmark’s condition at the time she stopped working was the sole explanation the insurer gave for denying both short-term and long-term disability benefits. In any event, Denmark presented objective evidence of her inability to work in the form of the IME, where Dr. Schur observed “tender points all over,” a decreased range of motion in the shoulders and hips, and “back discomfort” during an extensive physical examination.

The lead opinion treats this as “some objective evidence of [Denmark’s] physical restrictions,” but concludes that “it does not demonstrate objectively her inability to work because the evaluation does not specifically relate those restrictions either to the physical requirements of the job or to her overall stamina” (emphasis added). As this passage recognizes, a condition that saps a claimant of her energy would make her unable to perform the duties of nearly any occupation. See, e.g., Rose v. Shalala, 34 F.3d 13, 19 (1st Cir.1994). Dr. Schur’s report characterizes Denmark’s fibromyalgia in precisely that way: he endorses a course of treatment “to try and cut down on pain and help her sleep ... so that she isn’t so tired and, hopefully, can get some of her stamina back,” but opines that, “until that is accomplished, which may take months, she is clearly disabled not only from work, but from being able to take care of her household.” The IME therefore expressly relates the objective manifestations of Denmark’s condition — the pain Dr. Schur observed in various points throughout her body — to her inability to work.

In light of this diagnosis, our decision in Boardman v. Prudential Ins. Co. of Am., 337 F.3d 9 (1st Cir.2003), does not support the outcome reached by the majority. In Boardman, we upheld the insurer’s denial of long-term disability benefits to the claimant based on two factors: “(1) the absence of adequate evidence in [her] medical records indicating that [her] condition imposed limitation on her ability to perform the material and substantial duties of her own occupation ... and (2) the evidence to the contrary provided in the [IME] reports.... ” Id. at 17. Here, in stark contrast, the IME unequivocally concludes, based on objective observations, that Denmark is disabled, and that conclusion squares with the opinions of her treating physicians.

Liberty’s contrary finding, then, much more closely resembles the decision we overturned in Cook. There, as here, the physician who had examined the claimant opined that she was disabled from fatigue, but Liberty relied on purported defects in the opinion — identified by an in-house analyst who had not examined the claimant — to conclude otherwise. 320 F.3d at 20-23. In criticizing this approach, we suggested that an insurer who doubts a treating doctor’s diagnosis ordinarily should seek a second opinion, either in the form of peer review or an IME. Id. at 23. Here, of course, the IME confirmed the diagnoses of Denmark’s treating physicians. So Liberty reached the contrary conclusion based on purported defects in the IME identified by an in-house analyst who had not examined Denmark — using, in essence, the same approach we took such a dim view of in Cook.

Furthermore, Liberty’s efforts to discount the IME do not hold up, even under our deferential standard of review. In seeking to diminish Dr. Schur’s conclusion that Denmark is disabled, Liberty’s in-house nurse argued that “since an IME provides an examination on a specific date in time, its scope is limited on that inferences to the status of conditions 6 months previous cannot be accurately assessed.” Under this logic, however, a medical examination — which, by its nature, always occurs “on a specific date in time” — can never “accurately assess” the patient’s condition prior to having walked into the examination room. Either that premise is demonstrably false, or a patient is subjected to a pointless exercise every time she undergoes an IME. Cf. Cook, 320 F.3d at 23. Liberty’s reasons for spurning Dr. Schur’s opinion here, then, are no more supportable than its reasons for spurning the treating doctor’s opinion in Cook.

I recognize, as we did in Cook, the possibility of “cases where the opinion of the claimant’s treating physician can be rejected without reliance on any contradictory medical evidence developed by the plan administrator.” 320 F.3d at 23. But rejecting the opinions of the claimant’s treating physicians, and the corroborating view of an independent medical examiner, ought to require a considerably stronger justification than the one relied on here. Accordingly, I would overturn Liberty’s denial of long-term disability benefits to Denmark, even under our present standard of review. 
      
      . Denmark’s appeal concerns only the denial of her LTD claim. This opinion recounts the history surrounding the STD claim because Liberty considered this information in denying the LTD claim.
     
      
      . The claim record indicates that this letter was not considered as part of the original STD benefits determination. However, Liberty did consider the letter as part of its LTD benefits determination.
     
      
      . The distinction between objective evidence of fibromyalgia itself and objective evidence of the functional deficits resulting from the disease is an important issue that will be discussed infra.
      
     
      
      . The record indicates that a gallon of water weighs about eight pounds.
     
      
      . The Network Medical Review refers claims to physicians who evaluate the functional abilities of a claimant based on medical information provided.
     
      
      . The district court correctly noted that the appropriate civil enforcement provision for Denmark's benefits claim was 29 U.S.C. § 1132(a)(1)(B) only, not 29 U.S.C. §§ 1132(a)(2) and (3).
     
      
      . The court did apply heightened review to Dr. Bomalaski’s opinion as a sanction for Liberty's refusal to comply with a discovery order pertaining to its financial relationship with NMR, Dr. Bomalaski’s employer. The court’s treatment of this evidence will be discussed in detail below.
     
      
      . Leahy left open the possibility that under certain circumstances a district court might take evidence in an ERISA case. 315 F.3d at 18 n. 6.
     
      
      . The Court justified its holding by explaining that, in interpreting ERISA, courts "are guided by principles of trust law.” Firestone, 489 U.S. at 111, 109 S.Ct. 948. Consequently, "[t]rust principles make a deferential standard of review appropriate when a trustee exercises discretionary powers.” Id. (citing Restatement (Second) of Trusts § 187 (1959) ("Where discretion is conferred upon the trustee with respect to the exercise of a power, its exercise is not subject to control by the court except to prevent an abuse by the trustee of his discretion.”)).
     
      
      . Denmark cites Rodriguez-Abreu v. Chase Manhattan Bank, 986 F.2d 580 (1st Cir.1993), to support her contention of improper delegation, but that case is distinguishable. There, we applied de novo review because the plan granted discretionary authority to a named fiduciary, rather than to the plan administrator, but the plan administrator made the final claim determination. Id. at 582-84. Here, by contrast, the plan explicitly granted discretionary authority to Liberty, and Liberty also made the final decision on Denmark’s claim.
     
      
      . The district court took a different approach to this issue, finding that the improper delegation argument “miss[es] the point” because Liberty "acted as a de facto administrator and fiduciary” throughout the claim determination process and that, consequently, it was a fiduciary within the meaning of ERISA. Because GenRad’s adoption of the LTD policy granted discretionary authority to Liberty, it is unnecessary to consider the merits of this de facto analysis.
     
      
      . As previously explained, GenRad was the statutory plan administrator for the LTD plan, but it delegated the benefits determinations to Liberty, making Liberty a fiduciary that is responsible for paying claims out of its own assets. Thus, Liberty faces the structural conflict of interest described in Doyle and Doe.
      
     
      
      . The Massachusetts Employment Lawyers Association, AARP, and United Policyholders have filed briefs in support of Denmark. The American Council of Life Insurers has filed a brief in support of Liberty.
     
      
      . I think that the novelty of the district court’s piecemeal approach to the standard of review demonstrates an understandable uncertainty in the district court, under our current law, about what is permitted in the way of discovery to establish a conflict of interest that goes beyond a structural conflict, and what should be done when such a conflict exists. This uncertainty emphasizes further the need for en banc consideration of the standard of review issue in this type of ERISA case.
     
      
      . The dissent argues, based on our decision in Cook v. Liberty Life Assurance Co., 320 F.3d 11 (1st Cir.2003), that Liberty’s rejection of the opinions of Drs. Malanoski, Goodman, and Schur is particularly suspect because the treating physicians and the IME were in agreement. Part of the basis for our decision in Cook was the insurer's failure to seek an IME or subject the claimant’s file to review by another physician. Id. at 23. However, we did not hold that, where an IME agrees with a treating physician, an insurer cannot reach an opposite conclusion as to the claimant’s disability status. In fact, we stated that "[t]here may well be cases where the opinion of the claimant’s treating physician can be rejected without reliance on any contradictory medical evidence developed by the plan administrator.” Id. Moreover, Dr. Schur’s IME report was submitted as part of the STD appeals process, and Denmark submitted no additional medical reports during the subsequent LTD benefits determination.
     
      
      . Following the denial of Denmark’s LTD benefits appeal, Drs. Goodman and Hack each submitted a detailed four-page "Residual Functional Questionnaire” for Denmark's Social Security benefits determination. The questionnaires addressed some of the functional limitations caused by Denmark’s condition. Such information might have been helpful to Denmark if it had been submitted during the LTD process.
     
      
      . The dissent says that this opinion has failed to address the issue of whether Liberty improperly required Denmark to show that her condition changed on or around her date of disability, contending that neither case law nor the LTD policy supports such a requirement. I have not ignored that issue, and agree that "[a] disabled person should not be punished for heroic efforts to work by being held to have forfeited his entitlement to disability benefits should he stop working.” Hawkins v. First Union Long-Term Disability Plan, 326 F.3d 914, 918 (7th Cir.2003). In this case, however, it was Denmark and her physicians who initially claimed that her condition worsened around the date of disability. Under such circumstances it was not unreasonable for Liberty to consider the lack of evidence to substantiate Denmark's self-reported worsening condition.
     
      
      . The dissent contends that significant limitations on Denmark's stamina would preclude her from performing the duties of any occupation. However, Dr. Schur provided no information regarding such limitation. Although he recommended a course of treatment "to try and cut down on pain and help her sleep ... so that she isn't so tired and, hopefully, can get some of her stamina back,” this general comment merely recognizes some amount of fatigue without acknowledging that Denmark’s stamina is impaired to such a degree that she cannot work.
     
      
      . Liberty's letter reaffirming its denial of long-term disability benefits does cite the absence of “support of a severity of impairment that would preclude Ms. Denmark from performing her own occupational job duties,” but, as the lead opinion observes, a contemporaneous explanation in Liberty’s case log again notes the "limited evidence to support a need to cease working.”
     
      
      . The lead opinion devalues Dr. Schur’s report based on its view that it does not speak to the current level of Denmark's fatigue or its effect on her ability to do her job. But, again, the report states that Denmark’s fatigue has left her unable to work or to care for her household. I do not know how the report could have described the level of fatigue any more clearly, or pertinently.
     
      
      . The claimant in Cook suffered from asthma, chronic fatigue syndrome, and fibromyalgia. 320 F.3d at 15.
     
      
      . Liberty also subjected the IME to peer review by Dr. Bomalaski, who concluded that "[t]he clinical medical evidence does not clearly support severe impairment because as noted, the diagnosis of fibromyalgia remains in question....” As Judge Lipez notes, this conclusion is irrelevant because Liberty has never contested the diagnosis itself. The closest Dr. Bomalaski's report comes to supporting Liberty's position that Denmark is not disabled is its observation that "[l]imitations of ability to function are difficult to assess...." The report therefore does little to bolster Liberty's decision.
     
      
      . The lead opinion finds Liberty’s criticism of the IME "not unreasonable given that Dr. Schur himself had limited his evaluation to 'the time being.’ ” While Dr. Schur’s report indeed contains the phrase "at least for the time being,” it is clearly not intended to limit Dr. Schur's opinion retrospectively, but prospectively: in both the paragraph immediately preceding the phrase and the sentence immediately following it, Dr. Schur suggests interventions for Denmark that "hopefully will improve matters, so that she can get her stamina back and get back to work.” Accordingly, I do not see how the phrase "for the time being” can reasonably be read to support Liberty’s exceedingly narrow view of Dr. Schur's opinion.
     
      
      . While Liberty also relied on its surveillance of Denmark to support its decision, I agree with the district court that this evidence does not speak to whether she can consistently work full days as her job requires. Denmark was observed leaving her house on only two of the four days she was being watched and, even then, was out for only a few hours.
     