
    Roger L. BAKER, Appellee, v. Jo Anne B. BARNHART, Commissioner of Social Security, Appellant.
    No. 05-3106.
    United States Court of Appeals, Eighth Circuit.
    Submitted: March 14, 2006.
    Filed: June 13, 2006.
    
      Mark W. Pennak, argued, U.S. Department Of Justice, Civil Division, Washington, DC (Leonard Schaitman, U.S. Department Of Justice, Civil Division, Jennifer L. Fisher, Office of General with the SSA, on the brief), for appellant.
    Jay E. Denne, argued, Sioux City, IA, for appellee.
    Before COLLOTON, HEANEY, and GRUENDER, Circuit Judges.
   GRUENDER, Circuit Judge.

The Commissioner of Social Security (“Commissioner”) appeals the district court’s order reversing the Commissioner’s denial of Supplemental Security Income (“SSI”) disability benefits to Roger L. Baker. The Commissioner argues that the district court erred in relying on materials not contained in the administrative record, that substantial evidence supports the Commissioner’s decision to deny benefits and that even if substantial evidence does not support the Commissioner’s decision, the proper remedy is a remand to the Social Security Administration (“SSA”) rather than an order directing the award of benefits. We reverse the district court and remand with instructions to reinstate the Commissioner’s denial of benefits.

I. BACKGROUND

In February 2000, Baker slipped and fell on ice and injured his back. Baker’s family physician, Dr. Kissel, ordered an MRI after the muscle relaxants and pain killers he prescribed failed to improve Baker’s condition. The MRI showed degenerative disc disease, bulging disc material and “neuroforamen stenosis.” Dr. Kissel referred Baker to a neurologist, Dr. Case. Dr. Case confirmed Baker’s back problems and also diagnosed Baker with moderate carpal tunnel syndrome and tennis elbow in his left arm. Dr. Case referred Baker to a surgeon, Dr. Durward, for further treatment of his lumbar disc disease. Dr. Durward scheduled Baker for back surgery.

In September 2000, Baker informed his employer, Enterprise Rent-a-Car, that he could no longer work because of the pain in his back and legs. Dr. Durward and his associate, Dr. Noel, performed surgery on Baker in November 2000 to fuse two lumbar vertebrae. Regarding a follow-up examination on January 8, 2001, Dr. Durward stated in a memorandum, “As far as his back goes he is doing extremely well. Lost pretty well all of his preoperative pain syndrome.... Moves his back freely. His straight leg raising is unrestricted.... The films that were done last month demonstrate perfect fusion occurring [in the fused lumbar vertebrae].” After conducting an examination on January 29, 2001, Dr. Noel noted that Baker was “[d]oing well” and “[d]enies any complaints.” On April 23, 2001, Dr. Noel noted that he “encouraged [Baker] to get back to work and he is excited by that.” Dr. Noel recommended a functional capacity evaluation (“FCE”) “to determine [Baker’s] final restrictions.” Meanwhile, Baker applied for SSI disability benefits on February 28, 2001, but this application was denied on May 10, 2001 because his condition was “not expected to prevent [him] from working for a continuous period of 12 months from the date of [his] surgery.”

On June 15, 2001, Baker was examined by Dr. Durward. This time, Dr. Durward noted that Baker was “still complaining of significant pain” that was “[w]orse with physical activity, even light housework in a bent forward position exacerbates it.” Dr. Durward characterized Baker’s case as a “[p]uzzling situation. Seems to have an inordinate amount of pain despite what appears to be a satisfactory x-ray result.” Dr. Durward referred Baker to Dr. Keppen, a pain specialist, and recommended delaying the FCE until Baker’s pain situation was addressed. Baker returned to Dr. Durward on August 6, 2001, and Dr. Durward noted that Baker’s reported pain had not improved and that physical activity such as raking exacerbated the pain. However, Dr. Durward also noted that Baker could walk and perform straight leg raising without restriction. Dr. Durward released Baker for light-to-medium work, restricting him to a maximum occasional lifting limit of 25 pounds, a frequent lifting limit of 15 pounds and to limited bending at the waist. Dr. Durward also stated that an FCE would enable him to implement more accurate work restrictions.

Baker next visited Dr. Durward on November 11, 2001, complaining of continuing pain in his back and left forearm. Baker stated that he had raked leaves the previous day and that doing so exacerbated his back pain. Dr. Durward noted:

I believe that the fusion is solid and [Baker] is not any longer getting any pain from it. I have released him previously for work with lifting limits but he has not gone back to work. He has applied for Social Security. I think I would give the man the benefit of the doubt here and try one more method of treating his residual symptoms. What I would recommend is referral to Dr. Mike Donohue for consideration of an isokinetic rehabilitation program.

Baker returned to Dr. Durward on January 4, 2002 with the same complaints of pain. This time Dr. Durward noted:

Very difficult situation. We had referred him for isokinetic rehab but that was not undertaken. I think there are 2 things that we need to pursue here. Firstly, it would be worthwhile getting a second opinion to see whether there was something we have missed or something else that may be done to help his residual pain.... The second thing is that I think it would be of value to have a [FCE] done after he has had the second opinion to really try to identify exactly what his limitations are. This man is only 49. He tells that Social Security was turned down because his wife does work. I think he is motivated to do some kind of work if we can find something that does not exacerbate his residual pain symptoms.

Baker received the recommended second opinion from Dr. Lynn, but Dr. Lynn could not identify any cause for Baker’s pain and did not recommend any new treatment. Baker returned to Dr. Durward on May 29, 2002, after Dr. Durward had a chance to review the findings of Dr. Lynn as well as those of Dr. Keppen, the pain specialist. Baker reported that his pain had not decreased and that prolonged sitting, such as during an extended drive in an automobile, exacerbated his pain. Baker also reported that he was taking no pain medication other than “intermittent Tylenol.” Dr. Durward noted:

This man does describe a significant disabling pain syndrome. At this point in time I do not feel confident that there is any structural cause for it that I could help.... As far as the pain goes, I am going to give him samples of a non steroidal Bextra or Mobic to see if that gives him some relief. My recommendations are that he can only work on a light duty occupation. I would recommend a maximum lifting limit of 25 lbs., frequent lifting limit 10 lbs. Avoid bending at the waist. However because this is a complicated situation and he has significant pain I would like to get [an FCE] to try and determine more accurately what his limitations are.

Baker participated in an FCE with licensed physical therapist Terry F. Nelson on July 23, 2002. The FCE results showed evidence that Baker was exaggerating his symptoms and giving less than full effort during testing. Because the district court found serious fault with these FCE results, we discuss the FCE methodology in detail.

The FCE includes multiple evaluations designed to produce an objective measure of the patient’s effort and cooperation with the goal of the test. The Pain Replication Test (“PRT”) is designed to “determine whether or not an organic pain syndrome is present and how much it is limiting normal function.” The detailed FCE report describes the nature of the PRT as follows:

The rationale behind the PRT is that a static effort up to a barely perceptible increase in pain is reproducible within a small margin of error, a 15% Coefficient of Variation (CV).... [If] a barely perceptible increase in pain is experienced, then that force is not enough to produce an injury or an aggravation to the injury, but that level of force is reproducible. However, if the patient is exaggerating their [sic] symptoms and are not exerting their best effort up to a barely perceptible increase in their pain, then it will be difficult for them to produce static exertions with good consistency.
[The patients then lift] as hard as they are able to do safely. They must stay within their perceived safety zone, while attempting to determine their maximum tolerable pain level of force....
The person with an organic pain syndrome in their Low Back and Upper Body will be able to produce two distinctly different levels of force exertion, but the symptom exaggerator will not be able to produce two distinct levels of force and they will occasionally produce results that are impossible.
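The consistency check that the FCE report describes can be illustrated with a short calculation. The following is only a sketch under stated assumptions: the record does not disclose how the Blankenship protocol computes its Coefficient of Variation, so the sample standard deviation divided by the mean is assumed here, and the function names and force readings are hypothetical. Only the 15% threshold comes from the quoted report.

```python
from statistics import mean, stdev


def coefficient_of_variation(forces):
    """Return the CV of repeated force readings as a percentage.

    Assumption: CV = sample standard deviation / mean * 100. The FCE
    report quoted in the opinion does not state which formula it uses.
    """
    if len(forces) < 2:
        raise ValueError("need at least two readings")
    average = mean(forces)
    if average <= 0:
        raise ValueError("mean force must be positive")
    return 100.0 * stdev(forces) / average


def is_reproducible(forces, threshold_pct=15.0):
    """True when repeated exertions fall within the 15% CV margin."""
    return coefficient_of_variation(forces) <= threshold_pct


# Hypothetical readings in pounds of force, invented for illustration.
steady_trials = [40.0, 42.0, 41.0]
erratic_trials = [20.0, 45.0, 30.0]

print(is_reproducible(steady_trials))   # tight cluster of readings
print(is_reproducible(erratic_trials))  # widely scattered readings
```

On this reading, closely clustered readings stay under the threshold while widely scattered readings exceed it, which is the pattern the report associates with less than best effort.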

Nelson concluded in the detailed FCE report that “Mr. Baker demonstrated a non organic pain response during which he was unable to consistently replicate two levels of force based on his pain perception. The result is usually consistent with voluntary submaximal effort on this test.”

In addition to the PRT, the FCE included a Blankenship Reliability Profile. The detailed FCE report states, “The Blankenship Reliability Profile includes two objective components, Non Organic Signs and Validity. Patients scoring invalid on both components are felt to be attempting to control the test results to demonstrate a greater level of disability than what is actually present, the motivation of which is not known.” The Non Organic Signs measure is based on comparisons of the patient’s movement patterns with the patient’s own description of his pain, and also on “distraction” observations of those movements:

The Distraction category is defined as any improvement in the Movement Dysfunction when the patient is not aware that they [sic] are being observed, compared back to observations ... when the patient was fully aware that they were being observed.... A positive Distraction Category is evidence that the patient is attempting to demonstrate a greater level of pain disability than are [sic] actually present.

In his summary letter to Dr. Durward, Nelson stated, “Mr. Baker exhibited distracted movement patterns throughout the [FCE]. This patient demonstrated trunk range of motion that improved significantly with distraction.” Nelson also noted, “When comparing this patient’s movement patterns throughout the [FCE] to the patient’s complaints of pain throughout the [FCE], it is felt by this evaluator, that the patient demonstrated a poor correlation between movement patterns and his complaints of pain.”

Finally, the detailed FCE report describes the Validity component of the Blankenship Reliability Profile:

The Validity Profile is comprised of a cohort of individual tests that collectively help determine whether or not the patient is exerting their [sic] best effort during all of the FCE tests.... If the patient exerts effort up to the point of a barely perceptible pain increase, or slightly below that level so there is no pain increase at all, then they will pass the overall Validity Profile. If the patient does not pass the overall Validity Profile, then they [sic] have not exerted their best effort.... Current research, submitted ... [for publication], shows that a strong indicator of whether or not an evaluee is cooperating with and exerting their [sic] best effort on a functional test is the Evaluator’s professional judgment; however, most of the Validity Criteria of The Blankenship System FCE are based on published research.
There is also an empirical relationship between the number of validity criteria passed and the degree of effort exerted during testing and the reliability of the test results for predicting work performance. This algorithm was developed by K. Blankenship published in the revised edition of, The Blankenship System Functional Capacity Evaluation: The Protocol Manual, revision began in 1996, Macon, GA. That revised algorithm is shown below.

Validity Criteria Passed    Degree of Effort     Word Descriptor
90-100%                     Excellent Effort     Valid Results
80-89%                      Good Effort          Valid Results
70-79%                      Fair Effort          Valid Results
70-75%                      Fair Effort          Borderline Valid, Results are Conservative
60-69%                      Poor Effort          Borderline Invalid Results
<60%                        Very Poor Effort     Invalid Results
< 20 Criteria               May Be Unreliable, Professional Judgment Required

Nelson’s summary letter to Dr. Durward read, “This patient exhibited a sub maximal effort throughout the [FCE]. Mr. Baker passed 62% of the validity criteria[,] 36/58 validity criteria scored.... The results of the [FCE] will not provide an accurate aid in the medical management and vocational planning for this patient.” Nelson concluded that the FCE results, taken at face value, showed an ability to work at a “sedentary” level. However, taking into account Baker’s less-than-maximal effort and symptom exaggeration, Nelson estimated that Baker could handle “light-medium” work.
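The arithmetic behind the 62 percent figure, and where it falls in the table quoted above, can be checked with a brief sketch. This is an illustration only, not the Blankenship System's actual procedure: the function name is hypothetical, the treatment of fractional percentages at band edges is an assumption, and the reading of the 70-75% line as a sub-band of the 70-79% row is likewise an assumption drawn from the table as reproduced in the opinion.

```python
def classify_validity(passed, scored):
    """Map validity criteria passed to the table's effort bands.

    Returns (percent, degree_of_effort, word_descriptor, needs_judgment).
    Assumptions: raw percentages are compared against the lower edge of
    each band, and 70-75% is treated as a sub-band of the 70-79% row.
    """
    if scored <= 0 or not 0 <= passed <= scored:
        raise ValueError("passed must be between 0 and scored, scored > 0")
    percent = 100.0 * passed / scored
    if percent >= 90:
        effort, descriptor = "Excellent Effort", "Valid Results"
    elif percent >= 80:
        effort, descriptor = "Good Effort", "Valid Results"
    elif percent > 75:
        effort, descriptor = "Fair Effort", "Valid Results"
    elif percent >= 70:
        effort = "Fair Effort"
        descriptor = "Borderline Valid, Results are Conservative"
    elif percent >= 60:
        effort, descriptor = "Poor Effort", "Borderline Invalid Results"
    else:
        effort, descriptor = "Very Poor Effort", "Invalid Results"
    # The table's last row: fewer than 20 criteria scored may be
    # unreliable and calls for professional judgment.
    needs_judgment = scored < 20
    return percent, effort, descriptor, needs_judgment


# Figures reported in Nelson's summary letter: 36 of 58 criteria passed.
percent, effort, descriptor, needs_judgment = classify_validity(36, 58)
print(f"{percent:.0f}% -> {effort}, {descriptor}")
```

With the reported figures, 36 of 58 works out to roughly 62 percent, which falls in the 60-69% row of the table; because 58 criteria were scored, the fewer-than-20-criteria caveat does not apply.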

Baker returned to Dr. Durward on August 5, 2002. Dr. Durward, based on his own judgment, agreed with the FCE report that Baker could work at the light-medium physical demand category. However, Dr. Durward stated, “He has an eighth grade education. I am very doubtful that he is going to be able to find meaningful work within the light job classification. It may be reasonable for him to apply for Social Security.”

Baker returned to Dr. Kissel for a check-up on October 24, 2002. Dr. Kissel noted that Baker “says he is unable to do anything. He can’t bend over. He can’t tie his sho[e]s. He can’t even brush his teeth without having severe pain in his back.” Dr. Kissel reviewed Dr. Durward’s notes and recognized that Dr. Durward felt Baker’s functional capacity “was better than [Baker] thinks it is.” Dr. Kissel agreed that applying for SSI disability benefits was “a good idea.” Dr. Kissel also noted Baker’s borderline high blood pressure.

In December 2002, Baker filed a new application for SSI disability benefits, stating that he suffered from lower back pain, carpal tunnel syndrome, and high blood pressure. In the application, Baker stated that he operated his own tree-trimming business but did none of the physical labor. Baker added that because of the pain, he no longer mowed or raked his yard and that he had trouble using an eight-pound chain saw to cut tree branches. In a pain questionnaire, Baker stated that he vacuumed the floor and helped his wife with the dishes “once in a while,” but avoided any bending and leaning while doing so. He also stated that he drove his son to and from work and occasionally drove to the grocery store, but that the pressure of the car seat aggravated his pain.

Baker received a hearing before an administrative law judge (ALJ) on October 30, 2003 and testified that he was born in 1952, completed his education through the eighth grade and had not taken a GED test, was married and had one adult son who currently lived with Baker and his wife. Baker also stated that he was self-employed, devoting ten to twelve hours per week to a tree-trimming service for which he solicited customers through local ads, drove himself to customers’ sites and gave estimates for the jobs, and then hired part-time labor to do the physical work. He was no longer able to wield his small chain saw for more than a few minutes. He took no pain medications because none of them helped. In general, he was able to sit comfortably for only about half an hour at a time and then needed to stand up and walk around. He was able to sleep only two to three hours at a time without interruption from the pain. Driving 100 miles caused him serious pain.

In an opinion issued January 28, 2004, the ALJ ruled that Baker was not disabled. Applying the five-step evaluation set forth in 20 C.F.R. § 404.1520, the ALJ determined that Baker’s tree-trimming business did not qualify as “substantial gainful activity” for the purposes of step (i). The ALJ also found that Baker’s lower back impairment, while severe for purposes of step (ii), did not meet or equal one of the listings in 20 C.F.R. pt. 404, subpt. P, app. 1, as required in step (iii). Therefore, the ALJ proceeded to analyze Baker’s residual functional capacity (“RFC”) for step (iv):

[T]he claimant’s subjective complaints are inconsistent with the medical evidence of record. A functional capacity evaluation found evidence of non-valid criteria and evidence of deconditioning. Moreover, he takes no medication for his pain in spite of rather significant complaints. He is capable of full self-care, does a wide variety of household chores and outdoor tasks, including car washing, mowing the lawn, and raking leaves. He drives a car every day, shops and runs a number of errands.
The undersigned finds that the claimant performs a significant amount of activities of daily living for an individual with such longstanding intractable pain.
All the physicians in the file said the claimant could perform light, (or medium), work or that there was no etiology for the alleged pain symptoms.

The ALJ recited the results of Baker’s various medical tests and evaluations and found that Baker retained the RFC to perform light work, specifically finding that “the claimant’s allegations regarding his limitations are not totally credible.” Finally, for step (v), the ALJ applied the Medical-Vocational Guidelines of 20 C.F.R. pt. 404, subpt. P, app. 2 to Baker’s RFC, age, education, and work experience. Rule 202.10 of the Medical-Vocational Guidelines directed a finding of “not disabled.” The Appeals Council denied administrative review of the ALJ’s decision, making the ALJ’s decision the final decision of the Commissioner. Baker sought review in federal district court.

The district court reversed the ALJ’s decision and ordered that SSI disability benefits be awarded to Baker. The district court ruled that the ALJ’s finding that Baker “does a wide variety of household chores and outdoor tasks, including car washing, mowing the lawn, and raking leaves” and “performs a significant amount of activities of daily living” was not supported by substantial evidence on the record as a whole because Baker indicated that merely attempting such activities exacerbated his pain. The district court also faulted the ALJ’s reliance on the fact that Baker took no pain medication. However, the district court devoted most of its analysis to what it deemed to be a flawed FCE report. The district court did not accept that the FCE could show Baker to be giving valid effort 62 percent of the time and “cheating” 38 percent of the time. The district court stepped through the results from the individual FCE tests and repeatedly made comments such as “[f]or some reason, this cheater did not cheat on this category” and:

They have been testing him for a number of things. According to the test administrator, he has cheated. They are now testing him for static strength and he made a good effort and passed twenty-two out of twenty-five. This demonstrates, from the test administrator’s point of view, that the plaintiff is now well into the tests and has been cheating all along but he then decides, “hey, maybe I am cheating too much” so he did not cheat at all in this category and the test administrator concludes, “it was a good effort on his part.”

In summary, the district court discredited as a “hard-to-believe scenario” any conclusion that “a person who admittedly has not been a great student, whose education is limited” could “put on [such] a real show of deceit.” The district court also found, without citing to any evidence, that Nelson did not give proper consideration to Baker’s demonstrated poor cardiovascular fitness or “deconditioning” in evaluating Baker’s level of effort on the FCE tests. Acknowledging that neither Baker’s doctors nor the ALJ saw any problems with the FCE, the district court stated that “[n]ot only did the ALJ not know what was in the test, the doctors seem to have bought some of it without knowing how Nelson arrived at his scores.”

Finally, the district court took judicial notice of an article (“Soderberg article”) not contained in the administrative record to discredit Baker’s FCE results. The district court characterized the article as a treatise. Relying on a brief general criticism in the Soderberg article of the use of functional capacity evaluations, the district court found that the FCE methods employed by Nelson were not reliable and that “[t]he ALJ relied too heavily on this seemingly unreliable test.”

With the FCE results discredited, the district court found that the record overwhelmingly supported a finding of disability and directed the Commissioner to compute and award benefits. The Commissioner appeals, arguing that the district court erred in relying on the extra-record Soderberg article and that substantial evidence supports the ALJ’s decision. In the alternative, the Commissioner argues that even if substantial evidence does not support the ALJ’s decision, the proper remedy is a remand to the SSA rather than an order directing the award of benefits.

II. DISCUSSION

A. Consideration of Evidence Outside the Administrative Record

The parties contest the role played in the district court’s decision by the extra-record Soderberg article, noticed sua sponte by the district court. Baker contends that the district court did not ascribe evidentiary value to the extra-record article, but instead characterized it as a treatise. In doing so, the district court relied on our decision in United States v. Eagleboy, 200 F.3d 1137, 1140 (8th Cir.1999), in which we held that judicial notice of a document was permissible where the document “[did] not present new evidence on a disputed question of fact.” We stated further that “judicial opinions, treatises, law review articles, public records, and the like ... may be cited for the first time on appeal in support of a legal theory that was raised in the trial court.” Id.

The Commissioner, on the other hand, contends that the Soderberg article constitutes extra-record expert opinion evidence. In the context of judicial review of a decision of the Commissioner regarding SSI disability benefits, evidence outside the administrative record generally is precluded from consideration by the court. Delrosa v. Sullivan, 922 F.2d 480, 483 (8th Cir.1991); see also Johnson v. Chater, 108 F.3d 942, 946 (8th Cir.1997) (holding, where the claimant urged the court to take judicial notice of the weight of a gallon of gasoline to controvert the ALJ’s finding regarding his lifting ability, that “to take judicial notice of a fact such as the one [the claimant] suggests would undermine the ALJ’s role as the factfinder under the Social Security Act”); 42 U.S.C. § 405(g) (“The court shall have power to enter, upon the pleadings and transcript of the record, a judgment affirming, modifying, or reversing the decision of the Commissioner of Social Security, with or without remanding the cause for a rehearing.” (emphasis added)).

We find that the district court abused its discretion by taking judicial notice of the Soderberg article. See Johnson, 108 F.3d at 946 (noting that the court “has discretion as to whether to take judicial notice”). The district court took judicial notice of the article as a treatise. A treatise is defined by Merriam-Webster’s Collegiate Dictionary (11th ed.) as “a systematic exposition or argument in writing including a methodical discussion of the facts and principles involved and conclusions reached.” In addressing the treatise exception to the hearsay rule, the Second Circuit stated:

Learned treatises are considered trustworthy because “they are written primarily for professionals and are subject to scrutiny and exposure for inaccuracy, with the reputation of the writer at stake.” Fed.R.Evid. 803(18) advisory committee note. Failure, therefore, to lay a foundation as to the authoritative nature of a treatise requires its exclusion from evidence because the court has no basis on which to view it as trustworthy.

Schneider v. Revici, 817 F.2d 987, 991 (2d Cir.1987).

This logic applies with equal force where a court takes judicial notice of a treatise. The district court did not identify, and we cannot discern, any indication that the Soderberg article qualifies as a treatise of an authoritative nature. The article itself states, “This note provides a brief overview of various systems or techniques used in assessing a patient’s capacity to perform.” (Emphasis added.) The article, which is only five pages long, makes conclusory statements about several methods of testing human muscle performance and cites published sources for those statements, but the article does not include any “methodical discussion of the facts and principles involved.”

Furthermore, there is no indication that the article itself has been published in a peer-reviewed journal. The article apparently is available only on the internet at the web site of Cost Reduction Technologies, LLC and is written by that company’s Chief Scientist and Technical Director. That company promotes, and sells testing equipment for, “isokinetic” testing, an alternative to the FCE methods employed by Nelson. Given the limited depth of the article and the author’s and publisher’s interests in selling alternative testing equipment, there is no basis upon which we may conclude that the article’s brief criticism of the FCE is trustworthy in the nature of an authoritative treatise. Therefore, the district court abused its discretion by taking judicial notice of the Soderberg article and relying upon it to discredit the FCE results.

B. The Decision of the ALJ

The ALJ found that Baker retained the RFC to do light work and then applied the Medical-Vocational Guidelines to determine that Baker was not disabled. “The findings of the Commissioner of Social Security as to any fact, if supported by substantial evidence, shall be conclusive.” 42 U.S.C. § 405(g). Our standard of review of the ALJ’s decision is as follows:

Our role on review is to determine whether the Commissioner’s findings are supported by substantial evidence on the record as a whole. Substantial evidence is less than a preponderance, but is enough that a reasonable mind would find it adequate to support the Commissioner’s conclusion. In determining whether existing evidence is substantial, we consider evidence that detracts from the Commissioner’s decision as well as evidence that supports it. As long as substantial evidence in the record supports the Commissioner’s decision, we may not reverse it because substantial evidence exists in the record that would have supported a contrary outcome or because we would have decided the case differently.

McKinney v. Apfel, 228 F.3d 860, 863 (8th Cir.2000) (internal citations omitted).

Baker challenges the ALJ’s rejection of Baker’s assertions regarding his physical limitations, such as his claim that he must alternate between sitting and standing every half-hour. However, substantial evidence in the record as a whole supports the ALJ’s finding that Baker’s “allegations regarding his limitations are not totally credible.” The ALJ was entitled to draw conclusions about Baker’s credibility based on the FCE pain-replication and distraction analyses indicating that Baker was exaggerating symptoms and giving less than his full effort. See Clay v. Barnhart, 417 F.3d 922, 930 n. 2 (8th Cir.2005) (noting that two psychologists’ findings that the claimant was “malingering” on her IQ tests cast suspicion on the claimant’s motivations and credibility); Jones v. Callahan, 122 F.3d 1148, 1152 (8th Cir.1997) (holding that a physician’s observation “of the discrepancies in [the claimant’s] appearance in the examining room and those outside when he did not know that he was observed” supported an ALJ’s finding that the claimant’s complaints were not fully credible).

Baker also alleges that the ALJ improperly discredited Baker’s subjective complaints of pain due to the lack of objective evidence. While “an ALJ may not disregard subjective pain allegations solely because they are not fully supported by objective medical evidence, an ALJ is entitled to make a factual determination that a Claimant’s subjective pain complaints are not credible in light of objective medical evidence to the contrary.” Ramirez v. Barnhart, 292 F.3d 576, 581 (8th Cir.2002) (internal citation omitted). In this case, the ALJ discredited Baker’s subjective complaints of pain after considering the indications of symptom exaggeration during the FCE, Baker’s choice not to take pain medication, his ability to do a wide variety of chores and otherwise perform “a significant amount of activities of daily living,” and the failure of repeated examinations by Dr. Durward, Dr. Noel and Dr. Lynn to uncover any physical explanation for Baker’s reported lower back and hip pain.

Baker specifically challenges the ALJ’s finding that he “does a wide variety of household chores and outdoor tasks, including car washing, mowing the lawn, and raking leaves.” Baker contends that the record merely reflects that when he attempted to rake leaves or do other chores, his pain prevented him from doing so. An examination of the record reveals that Baker told Dr. Durward during his various visits that “even light housework in a bent forward position exacerbates” the pain (June 15, 2001); that raking exacerbated the pain (August 6, 2001); and, again, that raking exacerbated the pain (November 11, 2001). In his December 2002 application for SSI disability benefits, Baker stated that he no longer mowed or raked his yard, that he had trouble using an eight-pound chain saw to cut tree branches, and that he vacuumed the floor and helped his wife with the dishes “once in a while.” Finally, Baker reiterated before the ALJ that he was no longer able to wield his small chain saw for more than a few minutes and that he was able to help with the dishes in a minimal fashion for 15 or 20 minutes at a time.

We find that this evidence does not constitute substantial evidence on the record to support the ALJ’s finding that Baker “does a wide variety of household chores and outdoor tasks, including car washing, mowing the lawn, and raking leaves,” at least to the extent that the finding suggests Baker performed those activities for more than a few minutes at a time. However, that finding was just one component of the ALJ’s finding that Baker’s subjective complaints were not consistent with the record evidence. The record does contain substantial evidence to support the ALJ’s findings that Baker “is capable of full self-care” and “drives a car every day, shops and runs a number of errands.” These findings, especially those regarding the activities associated with Baker’s tree-trimming business, support the conclusion that Baker performed “a significant amount of activities of daily living.”

In addition, the ALJ also discredited Baker’s subjective complaints of pain based on the indications of symptom exaggeration during the FCE, Baker’s choice not to take pain medication and the absence of an etiology for the alleged pain symptoms. All these grounds are supported by substantial evidence on the record as a whole. First, as discussed above, the report of symptom exaggeration on the FCE provides good cause for the ALJ to discredit Baker’s subjective complaints of pain. Second, Baker’s decision not to take pain medication was a valid factor for the ALJ to consider. See Curran-Kicksey v. Barnhart, 315 F.3d 964, 969 (8th Cir.2003) (“[E]vidence that [the claimant] did not regularly require prescription medication or physical therapy could create doubt in a reasonable adjudicator’s mind with regard to her testimony about the extent of her pain.”). Third, the numerous medical opinions from Baker’s treating physicians agree that Baker’s reports of pain were inconsistent with his physical condition. “A treating physician’s opinion is due controlling weight if that opinion is well-supported by medically acceptable clinical and laboratory diagnostic techniques and is not inconsistent with the other substantial evidence in the record.” Ellis v. Barnhart, 392 F.3d 988, 995 (8th Cir.2005) (quoting Hogan v. Apfel, 239 F.3d 958, 961 (8th Cir.2001)) (internal quotation omitted).

“We ‘will not disturb the decision of an ALJ who considers, but for good cause expressly discredits, a claimant’s complaints of disabling pain.’ ” Goff v. Barnhart, 421 F.3d 785, 792 (8th Cir.2005) (quoting Gowell v. Apfel, 242 F.3d 793, 796 (8th Cir.2001)). In this case, the ALJ considered Baker’s subjective complaints of pain but found independent reasons, each supported by substantial evidence, to discredit those complaints. Therefore, we find that the ALJ did not improperly discredit Baker’s subjective complaints of pain.

After finding that substantial evidence supports the ALJ’s findings that Baker’s assertions of his limitations and his subjective complaints of pain are not wholly credible, we also find that substantial evidence supports the ALJ’s finding that Baker retained the RFC to do light work. Nelson concluded in the FCE report that Baker was capable of working in the light-medium category. In addition, after receiving the FCE report, Dr. Durward examined Baker again and agreed, independent of the FCE, that Baker was capable of work in the light-medium category. The FCE results and Dr. Durward’s opinion provide strong support for the ALJ’s finding that Baker retained the RFC to perform light work.

Baker contends that the ALJ should have given greater weight to the fact that both of his treating physicians, Dr. Kissel and Dr. Durward, encouraged him to apply for Social Security disability benefits. This argument fails. Dr. Durward agreed that Baker was capable of light/medium work but stated, “[Baker] has an eighth grade education. I am very doubtful that he is going to be able to find meaningful work within the light job classification.” Dr. Kissel reviewed Dr. Durward’s notes and agreed with his conclusion. However, a physician’s opinion regarding a claimant’s ability to find work within a particular classification is not a “medical opinion.” See Stormo v. Barnhart, 377 F.3d 801, 806 (8th Cir.2004) (“[T]reating physicians’ opinions are not medical opinions that should be credited when they simply state that a claimant can not be gainfully employed, because they are merely opinions on the application of the statute, a task assigned solely to the discretion of the [Commissioner].” (internal quotation marks omitted)).

Finally, the ALJ properly applied the Medical-Vocational Guidelines to determine if Baker could make an adjustment to other work. Generally, where the claimant suffers from a nonexertional impairment such as pain, the ALJ must obtain the opinion of a vocational expert instead of relying on the Medical-Vocational Guidelines. Ellis, 392 F.3d at 996. However, the Guidelines still may be used where the nonexertional impairments “do[ ] not diminish or significantly limit the claimant’s residual functional capacity to perform the full range of Guideline-listed activities.” Id. (quotation omitted). In particular, “[w]hen a claimant’s subjective complaints of pain ‘are explicitly discredited for legally sufficient reasons articulated by the ALJ,’ the Secretary’s burden [at the fifth step] may be met by use of the [Medical-Vocational Guidelines].” Naber v. Shalala, 22 F.3d 186, 189-90 (8th Cir.1994) (quoting Hutsell v. Sullivan, 892 F.2d 747, 750 (8th Cir.1989)). In this case, as discussed above, the ALJ expressly discredited Baker’s subjective complaints of pain for legally sufficient reasons; therefore, use of the Guidelines was proper.

We conclude that the ALJ’s determination that Baker was not disabled is supported by substantial evidence in the record as a whole. Therefore, we do not reach the issue of whether the district court erred in directing the award of benefits rather than remanding to the SSA for further proceedings.

III. CONCLUSION

We conclude that the district court erred in relying on materials not contained in the administrative record and in finding that the Commissioner’s denial of benefits was not supported by substantial evidence. Therefore, we reverse the district court and remand with instructions to reinstate the Commissioner’s denial of benefits.

HEANEY, Circuit Judge, dissenting.

I would remand this case for rehearing because the ALJ failed to fully and fairly develop the record prior to determining Baker’s eligibility for social security disability payments. Thus, I respectfully dissent from the majority opinion affirming the Commissioner's denial of benefits.

It is settled law in this circuit that social security hearings are nonadversarial, and the ALJ is responsible, independent of the claimant’s burden, for fully and fairly developing the record. Snead v. Barnhart, 360 F.3d 834, 838 (8th Cir.2004). The duty to develop the record extends to cases like this one where the claimant is represented by counsel. Id. “The ALJ possesses no interest in denying benefits and must act neutrally in developing the record.” Id. Here, the ALJ failed in several instances to fully and fairly develop the record.

The ALJ found that Baker had not engaged in substantial gainful employment since the alleged onset of his disability in February 2002. The ALJ further found that Baker had a combination of impairments considered severe, but not medically equal to a listed impairment, and finally that, pursuant to the Medical-Vocational rule 200.10, he was not disabled.

The ALJ found that Baker could “perform the full range of light work,” (App. at 63), but failed to make a finding regarding Baker’s ability to return to his own past relevant work. This omission was serious and prejudicial, because if Baker could not return to his past relevant work, the burden would shift to the Commissioner to prove that there was other work in the national economy that Baker could perform in light of his severe impairments. Baumgarten v. Chater, 75 F.3d 366, 368 (8th Cir.1996).

Baker’s principal occupation from May 1983 to September 2000 was that of a tree trimmer. (Admin. R. at 106.) Until Baker fell and was injured in September 2000, he did all of the tree work himself. (Id. at 384.) Thereafter, he was limited to supervising others in the tree trimming operation. (Id. at 377-382.) His work as a tree trimmer is classified as heavy (Dictionary of Occupational Titles, § 408.664-010), and clearly beyond his current functional capacity. Had the ALJ found that tree trimming was Baker’s past relevant work, the burden would have shifted to the Commissioner, as noted above.

Prior to working as a tree trimmer, Baker worked as an assembly line worker in a hog processing plant from October 1970 to May 1983. (Admin. R. at 106, 110.) This job is also classified as heavy, and therefore beyond Baker’s present capabilities. (Dictionary of Occupational Titles, § 525.381-014.) Thus, from 1970 to 2000, Baker’s past relevant work was in positions classified as “heavy,” and beyond his current functional capacity.

In addition to the full-time occupations discussed above, Baker supplemented his income with part-time jobs. From December 1994 to December 1999, Baker worked approximately fifteen hours a week as a laundry attendant. (Admin. R. at 106, 109.) From December 1999 to September 2002, Baker worked approximately thirty-three hours per week as an automobile detailer for Enterprise Rent A Car. (Id. at 106, 108.) The ALJ failed to determine whether Baker was capable of performing the duties of either of these past jobs.

The ALJ also erred in holding that Baker was not disabled because he took no pain medication. The ALJ’s statement that Baker “takes no medications for his pain in spite of rather significant complaints” is only half true. (App. at 61.) The record is replete with references to pain medications prescribed by Baker’s treating physicians and nothing supports an inference that he did not take the prescribed drugs. Baker did report on a Social Security Administration form that he was not “presently” taking pain medication. (Admin. R. at 128.) But Baker also testified that he quit taking the pain medication because it did not relieve his pain. (Id. at 384.) Quitting pain medication because it does not work is far different from quitting it because he was not in pain. In Baumgarten, 75 F.3d at 369, the claimant testified that she quit taking prescription pain medication because it was no more effective than Tylenol. This court held that the ALJ’s “erroneous assertion” that the claimant provided no reasonable explanation for discontinuing the pain medication called the ALJ’s ultimate decision into doubt. Id.

As the majority notes, the ALJ also incorrectly determined that Baker’s complaints of pain were inconsistent with his performance of activities of daily living. The ALJ improperly embellished Baker’s activities, noting that he did a wide variety of household chores and outdoor tasks, including car washing, mowing the lawn, and raking leaves. As to household chores, Baker testified:

I don’t do much around the house. I help my wife barely with the dishes. I take them out of the sink and I stand at the counter. I hand them — the dishes to my wife, she bends over and leans over to put the dishes in the dishwasher. And once in a while I’ll wash the pots and pans at the kitchen sink. And about 15, 20 minutes of doing that it’s just— it’s really hard on me. I got to quit doing it.

(Admin. R. at 381.) The ALJ asked no further questions regarding household chores. The ALJ made its determination regarding car washing, mowing the lawn, and raking leaves based on the daily activities questionnaire. In answering the questionnaire, Baker stated that he “rarely” washed the car, “rarely” mowed the lawn, and “rarely” raked the leaves. (Id. at 124 (emphasis added).) Certainly it was improper for the ALJ to omit the fact that Baker only rarely did these chores.

Furthermore, this court has consistently held that “the ability to do activities such as light housework and visiting with friends provides little or no support for the finding that a claimant can perform full-time competitive work.” Hogg v. Shalala, 45 F.3d 276, 278 (8th Cir.1995); see also Baumgarten, 75 F.3d at 369 (finding the claimant’s ability to make the bed, prepare food, perform light housekeeping, grocery shop, and visit friends an unpersuasive reason to deny benefits). Accordingly, Baker’s failure to take prescription pain medication and his rare performance of daily activities do not support the ALJ’s conclusion that Baker is not entitled to disability benefits.

The ALJ gave significant weight to the residual functional capacity evaluation, which stated in part:

1. This patient exhibited a sub maximal effort throughout the functional capacity evaluation. Mr. Baker passed 62% of the validity criteria 36/58 validity criteria scored.... The results of the functional capacity evaluation will not provide an accurate aid in the medical management and vocational planning for this patient.
2. Mr. Baker demonstrated during the functional capacity evaluation, an ability to work at a Sedentary physical demand level. This would allow an individual to perform occasional material handling activities with 10 lbs and less weight, 1-33% of the day. This is according to the Dictionary of Occupational Titles, U.S. Department of Labor, 1991. The results are considered this patient’s minimal level of function and his maximal level of function must be left to conjecture....
3. Estimated Physical Demand Level— The results of the functional capacity evaluation demonstrated an invalid result. Mr. Baker during the Occasional Material handling tests demonstrated the ability to perform material handling activities at a Sedentary Level, 10 lbs and less, however, the results of the Occasional material handling tests were invalid.... This patient will be placed at a Light-Medium Physical Demand Level. This would allow an individual to perform occasional material handling activities with 35 lbs and less. This is according to the Dictionary of Occupational Titles, U.S. Department of Labor, 1991.

(Commissioner’s Factual Addendum for Appellant’s Br. at 69B.)

The district court rejected the ALJ’s finding and questioned the validity of the test. The district court concluded that it was highly improbable that a 50-year-old man with an eighth grade education could manipulate the test, that he could have a valid effort on 62% of the tests and invalid on the remainder. In plain language, the court opined that it could not accept the tester’s conclusion that Baker had selectively cheated on the test, and refused to give the test weight in its decision. To buttress its opinion, the district court improperly considered and quoted from a “treatise” by Gary L. Soderberg. Irrespective of the Soderberg article, the record sufficiently raises concerns regarding the FCE as a basis for awarding or denying disability benefits. This question deserves careful study and consideration by the ALJ and the Commissioner in the first instance.

Thus, I would remand to the ALJ for a hearing consistent with this dissent, that includes an opportunity to fully explore the propriety of the use of the FCE in this case. The ALJ has the responsibility to fully and fairly develop the record, recognizing that because Baker cannot return to his past relevant work, the burden shifts to the Commissioner to show that there are jobs in the national or regional economy in meaningful numbers that Baker can perform despite his disabilities. 
      
      . Nelson referenced the terms "sedentary” and "light-medium” to the Dictionary of Occupational Titles, U.S. Department of Labor, 1991.
     
      
      . The five-step evaluation is as follows:
      (i) At the first step, we consider your work activity, if any. If you are doing substantial gainful activity, we will find that you are not disabled.
      (ii) At the second step, we consider the medical severity of your impairment(s). If you do not have a severe medically determinable physical or mental impairment that meets the duration requirement in § 404.1509, or a combination of impairments that is severe and meets the duration requirement, we will find that you are not disabled....
      (iii) At the third step, we also consider the medical severity of your impairment(s). If you have an impairment(s) that meets or equals one of our listings in appendix 1 of this subpart and meets the duration requirement, we will find that you are disabled ....
      (iv) At the fourth step, we consider our assessment of your residual functional capacity and your past relevant work. If you can still do your past relevant work, we will find that you are not disabled....
      (v) At the fifth and last step, we consider our assessment of your residual functional capacity and your age, education, and work experience to see if you can make an adjustment to other work. If you can make an adjustment to other work, we will find that you are not disabled. If you cannot make an adjustment to other work, we will find that you are disabled....
      20 C.F.R. § 404.1520(a)(4).
     
      
      . "Residual functional capacity" is defined as the most an individual can still do despite the "physical and mental limitations that affect what [the individual] can do in a work setting" and is assessed based on all medically determinable impairments, including those not found to be "severe.” 20 C.F.R. § 404.1545.
     
      
      . Gary L. Soderberg, "A Note on Methods of Testing for Human Performance Capacity,” available at http://www.costreduction-tech.com/pdAaliditypaper.pdf (last visited May 2, 2006).
     
      
      . If a party shows good cause that new material evidence should be considered, the district court may remand to the agency for consideration of that evidence. Delrosa, 922 F.2d at 483-84; see 42 U.S.C. § 405(g).
     
      
      . With regard to the FCE's conclusions about effort and symptom exaggeration, we find that the district court's characterization of the physical therapist as merely watching each FCE test and deciding on an ad hoc basis if the subject is “cheating” on that test is not supported by the record. Instead, the record indicates that the FCE's conclusions about overall effort and symptom exaggeration are drawn in an empirical fashion by comparing the results of a large number of tests and observations. Because Baker did not submit any evidence to the ALJ challenging the reliability of the FCE methods employed by Nelson, and because those methods were accepted by Dr. Durward and Dr. Kissel, Baker's treating physicians, we see no reason not to accept the FCE results.
     
      
      . Although not cited by the ALJ, Dr. Durward’s notes of January 4, 2002 reflect that Baker also did not undertake the recommended course of physical therapy.
     
      
      . The dissent notes that the ALJ did not expressly find that Baker could not return to his past relevant work. However, this finding was made implicitly when the ALJ proceeded past step (iv) and reached step (v) of the 20 C.F.R. § 404.1520 analysis. See footnote 2, ante. The absence of an express finding did not prejudice Baker with respect to burden-shifting at step (v) because use of the Medical-Vocational Guidelines is sufficient to meet the Commissioner's burden. See Naber, 22 F.3d at 189-90 ("[T]he Secretary’s burden [at the fifth step] may be met by use of the [Medical-Vocational Guidelines].” (quoting Hutsell, 892 F.2d at 750) (alterations in Naber)).
     
      
      . The initial examiner for the Social Security Administration determined that Baker "is not able to return to his previous vocation.” (Admin. R. at 140.)
     
      
      . See, e.g., Admin. Rec. at 143-145, 148-150, 157-158, 160, 175, 180, 184, 189, 192, 202, 216.
     