
    Karen S. GARRISON, Plaintiff-Appellant, v. Carolyn W. COLVIN, Commissioner of Social Security Administration, Defendant-Appellee.
    No. 12-15103.
    United States Court of Appeals, Ninth Circuit.
    Argued and Submitted March 10, 2014.
    Filed July 14, 2014.
    
      Mark Caldwell, Caldwell & Ober, Phoenix, AZ, for Plaintiff-Appellant.
    Laura H. Holland (argued), Special Assistant United States Attorney, Social Security Administration, Office of the General Counsel, Denver, CO; John S. Leonardo, United States Attorney, Michael A. Johns, Assistant United States Attorney, United States Attorneys’ Office, Phoenix, AZ; John Jay Lee (of counsel), Regional Chief Counsel, Region VIII, Social Security Administration, Office of the General Counsel, Denver, CO, for Defendant-Appellee.
    Before: JEROME FARRIS, STEPHEN REINHARDT, and A. WALLACE TASHIMA, Circuit Judges.
   OPINION

REINHARDT, Circuit Judge:

Karen Garrison appeals from a denial of Social Security benefits, arguing that the Administrative Law Judge (“ALJ”) erred in rejecting her symptom testimony and in assigning little weight to the opinions of her treating medical caretakers. In a decision that the Commissioner does not contest, the district court determined that the ALJ erred in assessing the medical opinion evidence and remanded the case for further proceedings. We conclude that the ALJ also erred in discrediting Garrison’s symptom testimony, and that the district court abused its discretion in remanding for further proceedings. Applying our settled “credit-as-true” rule, we reverse the judgment below with instructions to remand this case to the ALJ for the calculation and award of benefits.

BACKGROUND

I

Karen Garrison was born in 1970. In an application for disability insurance benefits filed on September 7, 2007, she stated that she has been disabled since April 17, 2007 due to a combination of physical and mental impairments. Before her stated disability onset date, she worked as a bus driver, bus monitor, cashier, pizza cook, convenience store clerk, and customer service manager.

After Garrison’s application for benefits was denied on January 4, 2008, and after her request for reconsideration was denied on March 12, 2008, she requested a hearing before an ALJ. At the hearing, which was held on July 14, 2009, Garrison presented extensive medical records to support her claim of disability and testified at length about how her impairments affect her daily life. The ALJ also heard testimony from a vocational expert (“VE”). The evidence presented at the hearing covered the period from April 2007 to June 2009, and comprehensively addressed Garrison’s physical and mental health.

A. Physical Impairments

Since 2006, as shown by the records of Dr. Christopher Labban, her primary care provider, Garrison has suffered from a variety of physical ailments — including chronic neck and back pain, degenerative joint disease, sciatica, obesity, asthma, and herniated discs. Labban’s notes from late 2006 and early-to-mid 2007 reveal that, despite a variety of attempted treatments, many of these medical conditions persisted or worsened. Garrison’s chronic neck and back pain was an especially serious problem: nothing seemed to help, it began interfering with her daily life, and the pain started slowly creeping into her shoulders, arms, and legs. In response, Labban referred Garrison to Dr. George Wang, a neurologist. Wang started treating Garrison in September 2007 and continued treating her through June 2009, the date of the last medical record in this case.

When Wang first evaluated Garrison, he noted that she reported severe neck and back pain that radiated outward and caused other symptoms; the neck pain radiated into her upper arms and caused numbness and tingling, and the lower back pain caused a burning sensation that radiated into her legs. He observed that prolonged standing aggravated her pain, as did turning her head. In his assessment, Wang noted that MRI results confirmed that Garrison suffered from spinal disc protrusions and determined that her pain symptoms involved “radicular features.” He also remarked that Garrison had decreased muscle strength of the left triceps and decreased strength of the lower extremities more distally, symptoms that he viewed as “suggestive of cervical and lumbar radiculopathy.” Follow-up tests performed on October 1, 2007 supported his initial diagnosis of cervical and lumbar radiculopathy.

Wang further noted in September 2007 that Garrison reported experiencing what she described as “seizures,” in which her mind went blank, her body seized up, she started shaking uncontrollably, and she heard voices yelling at her. Wang theorized that she was suffering from panic attacks.

Garrison’s symptoms grew worse between September and December 2007. Wang’s treatment notes indicate that Garrison reported having experienced another “seizure” while at the grocery store, causing her to seize up, feel like most of her body was burning, slur her speech, and collapse. Garrison also stated that, at night, she felt sustained twitching in her body, “like electronic current zaps through her head.” More significant, her burning and aching neck pain continued to radiate into her shoulders at a 10/10 level of intensity; Garrison stated that “she has electric jolts going up to her head when she moves her neck” and that her arms kept falling asleep. Garrison’s lower back pain, too, rated 10/10 in intensity, and Wang observed that it was made worse by prolonged standing. Noting that Garrison was very drowsy from all of the pain medication she was taking — medication that did not effectively control her pain — Wang cleared Garrison for epidural shots. He also modified the medications she was taking, which at the time included Tegretol, Neurontin, and Baclofen.

In November 2007, Dr. Ernest Griffith, a state agency consulting physician who never examined Garrison, completed a physical residual functional capacity assessment. His report consists of answers to a series of check-box questions and a few lines of explanation that do not reference most of Garrison’s treatment records or any of her statements. Griffith noted in his report that he did not consider any statements by Garrison’s treating and examining physicians. Griffith opined that Garrison could occasionally lift or carry 20 pounds, frequently lift or carry 10 pounds, stand and/or walk for about 6 hours in an 8-hour workday, and push or pull without limitation. He added that she could occasionally climb ramps and stairs, kneel, crouch, and crawl, and could frequently stoop. He opined that she had no limitations with respect to cold, heat, wetness, humidity, noise, and vibration, but should avoid concentrated exposure to fumes, odors, dusts, gases, poor ventilation, and hazards.

In January 2008, consistent with his treatment records and those of Labban over the prior months, Wang noted that Garrison’s symptoms of “seizures” and unbearable pain persisted unabated. Her neck and back pain still rated at 10/10 in intensity, still radiated into her arms, shoulders, and legs, and still caused a mix of burning, numbness, and tingling. Garrison also reported intense fatigue due to insomnia and frequent twitching at night, as well as ongoing “seizures” that disrupted her daily activities. One such “seizure,” in late January 2008, caused her to shake on the left side of her body and was accompanied by a spell of confusion and stuttering. These symptoms, in turn, were exacerbated by drowsiness from her medications, episodes of weakness, and a persistence in decreased muscle strength in her left arm and lower extremities.

As of February 2008, Garrison’s pain symptoms were only marginally less severe: she reported that her neck pain remained at 8/10 in intensity, with jerking, numbness, and tingling in her arms that frequently caused her to drop things, and stated that her back pain was at 7/10 in intensity, with a burning sensation that radiated into her legs. These and other symptoms often caused her to lose her balance and to lean to one side.

Summarizing his assessment of Garrison’s symptoms on February 20, 2008, Wang stated as follows in a “Pain Functional Capacity (PFC) Questionnaire” that posed a series of questions followed by check-boxes: Garrison did have pain; this pain was “moderately severe (pain seriously affects ability to function)”; and this pain was reasonably expected to result from objective clinical or diagnostic findings documented in Garrison’s medical records. Wang noted that Garrison’s pain was precipitated by changing weather, movement, overuse, stress, and cold, and was “frequently” severe enough to interfere with Garrison’s attention and concentration. Wang added that, due to her pain symptoms, Garrison “frequently” experienced deficiencies of concentration, persistence or pace that resulted in a failure to complete tasks in a timely manner.

In March 2008, Dr. Eric Feldman, to whom Garrison had been sent for a consultation, remarked that Garrison had been experiencing neck and back pain “for the past year,” and that she had suffered “a fairly severe pain throughout the neck, scapular area, thoracic and lumbar spine.” Feldman observed that neither physical therapy nor a cervical epidural steroid injection had afforded Garrison much relief, that her burning pain remained constant, and that she was still experiencing tingling in both hands and leg pain. His physical examination revealed tenderness over the cervical paraspinal musculature and facet joints, as well as over the suprascapular and periscapular musculature. Noting limited cervical range of motion, a variety of spinal issues, and limited rotation bilaterally, Feldman stated that his impressions included fibromyalgia, central hyperalgesia, and disc protrusions and disc space narrowing.

Over the next two months, Garrison’s pain improved somewhat. She had a hysterectomy, some of her other symptoms abated, and a transforaminal epidural steroid injection temporarily reduced her back and leg pain. However, she continued to experience symptoms of radiculopathy, including sharp pain in her neck, arms, and shoulders.

Wang’s records show that, by June 2008, Garrison, who had stopped a few of her medications, was again experiencing numbness, tingling, and sharp pain in her hands, arms, and neck. She had also experienced a “seizure” on June 15, 2008. Wang instructed Garrison to resume some of her medications and ordered her not to drive for at least three months. A few weeks later, Garrison once again experienced a “seizure,” and reported to Wang that, despite the Percocet that she was taking every six hours, she was still in pain. By August 2008, Garrison was again experiencing intense pain in her neck and right shoulder — and received only partial, short-lived relief from the Percocet.

In September 2008, Feldman performed another epidural steroid shot, but this time the shot led to only a few days of relief from the again-chronic burning back pain, which soon started shooting down from Garrison’s lower back into her thighs and legs. This and other symptoms led Garrison to have trouble keeping her balance. Wang prescribed more Percocet and also told Garrison to start using a rolling walker for her gait instability. Garrison saw little improvement in the months that followed. In October 2008, even with Percocet three times a day for pain, she reported feeling a lot of pain radiating from the right side of her lower back into her right leg, causing numbness and tingling. Then, in November 2008, Wang noted a mild limp, tension headaches escalating to migraines, and ongoing cervical and lumbar radiculopathy.

After a short-lived respite from the back pain (though not other symptoms) in December 2008 — the result of another epidural shot — Garrison again reported severe back pain in January 2009. Feldman’s records reveal that, by this point, Percocet had become less effective, providing partial relief for only a few hours per dose. Garrison’s neck continued “bothering her significantly,” and the pain from her neck started radiating into her occipital region. Garrison was started on new pain medications, including MS Contin, but continued to feel numbness and tingling in her arms and legs. These symptoms, as well as severe headaches, persisted into February 2009.

In April 2009, Wang noted that Garrison’s headaches had finally ceased, but also observed that she had experienced an increase in her back pain and that this pain was still radiating down her legs. Garrison told Wang that it felt like her feet “are on fire,” adding that she could not walk or stand for a long time and that her feet were tender to the touch. Wang opined that Garrison’s “pain is affecting her ability to function.”

Garrison’s last medical record is from June 2009. In it, Wang states as follows: “The patient states she [continues] to have constant pain to her lower back and neck rated as 6 on 0-10 pain scale. She continues to have numbness and tingling to her bilateral lower extremities, greater on the right. The patient states she continues to have some lower extremity weakness. She states her pain is aggravated by sitting and standing for long periods.” Wang once again recorded his impression that Garrison’s pain “is affecting her ability to function,” noting that her drugs were not providing adequate relief.

B. Mental Health Issues

Throughout the relevant time period, Garrison struggled with a variety of diagnosed mental impairments, including bipolar disorder, anxiety, bouts of insomnia, auditory and visual hallucinations, and paranoia. Her treating medical caretakers also came to view her “seizures” (sometimes called “pseudo-seizures” in her treatment records) as the result of psychiatric issues.

In September 2007, Garrison visited Nurse Practitioner Susan Anderson for the first time. Anderson, who would become Garrison’s primary psychiatric care giver, noted that Garrison suffered from insomnia, anxiety, depressive symptoms, nightmares, and flashbacks. Anderson diagnosed post-traumatic stress disorder and possible bipolar disorder. She also recorded a Global Assessment of Function (GAF) score of 50. Later that month, Anderson noted that Garrison had recently suffered another “seizure” and, more important, that Garrison was having trouble with her medication. Garrison stated that she felt electric shocks going through her body, and added that she alternated between feeling very depressed and feeling like she had superpowers that would let her lift piles of wood in the heat.

Two months later, in November 2007, Dr. Wayne General examined Garrison at the behest of a state agency. He concluded that her full scale IQ was 77, placing her in the 6th percentile, and observed that “Karen is currently functioning in the range of borderline intelligence.” General then noted that Garrison’s “overall short-term memory is in the borderline range” and that her “concentration is in the low average range.” When subjected to further tests, Garrison performed in “the lower average range” on simple tasks and “very poorly” on more complex tasks requiring concentration.

Assessing Garrison’s overall mental well-being, General made a number of diagnoses:

Axis I: 296.52 Bipolar I Disorder, Most Recent Episode Depressed, Moderate
995.50 Victim of Physical or Sexual Abuse as a Child
995.81 Victim of Physical or Sexual Abuse as an Adult
309.81 Posttraumatic Stress Disorder, Acute, Chronic, Delayed Onset
304.80 Polysubstance Dependence, Alcohol, Amphetamines, Cannabis and Cocaine in Full, Sustained Remission by self-report
Axis II: V62.89 Borderline Intellectual Functioning, by examination
Axis III: Overweight, joint disease (neck and back) and arthritis, by referral history; migraine cephalgia by self-report

General concluded that Garrison’s “prognosis for returning to work is currently poor, as she had difficulty maintaining concentration and manifested a borderline short-term memory. She did not have sufficient emotional control, and broke into tears three times during test administration. Her ability to perform work-related tasks is currently inadequate based on cognitive functions such as attention, concentration, processing speed and short-term memory.”

In December 2007, Anderson noted that Garrison’s mood was “unstable,” that Garrison was dealing with several family issues, and that Garrison was experiencing intense anxiety and severe racing thoughts. Anderson assessed a GAF score of 55, with a continued diagnosis of Bipolar Disorder II and PTSD. Garrison’s attention and concentration, as well as her insight and judgment, were only “fair.” Garrison’s condition did not materially change over the next few months. In January 2008, despite slight improvement due to use of Abilify, Anderson’s records show that Garrison remained anxious and deeply paranoid, with a GAF score of 55-60.

That month, Dr. Adrianne Gallucci, Psy. D., a state agency consultant, reviewed some of Anderson’s medical records and filled out a check-box form to state her conclusions. Gallucci opined that Garrison’s impairments were “severe but not expected to last 12 months,” and identified “[c]oexisting [n]onmental [i]mpairment(s) that require referral to another medical specialty.” Gallucci checked off the boxes for affective disorders, mental retardation, and anxiety-related disorders. Under affective disorders, Gallucci marked bipolar disorder. In a summary section, Gallucci checked boxes for “mild” degree of limitation of function in “restriction of activities of daily living” and “difficulties in maintaining social functioning.” Gallucci checked boxes for “moderate” degree of difficulties in “maintaining concentration, persistence, or pace.” In a brief explanation section, Gallucci remarked that Garrison had experienced a good initial response to some medication and had started mental health treatment, and for these reasons was not likely to suffer an impairment lasting more than 12 months.

Anderson’s records show that, in February 2008, Garrison’s GAF dropped to 55. In her visit, Garrison reported hearing ghosts and spirits calling her name, a variety of other auditory and visual hallucinations, nightmares, severe anxiety, obsessive preoccupations, and persistent insomnia. Her attention and concentration, as well as her insight and judgment, remained only “fair.”

That month, Anderson completed a “Medical Assessment of the Patient’s Ability to Perform Work Related Activity” (“the 2008 Assessment”). In it, Anderson reported a moderate impairment in Garrison’s ability to relate to other people; to perform daily activities; to understand, carry out, and remember instructions; to respond appropriately to supervision; to respond appropriately to co-workers; and to perform varied tasks. Anderson reported moderately severe constriction of interests, including in Garrison’s ability to respond to customary work pressures; ability to perform complex tasks; ability to complete a normal workday/workweek without interruptions from psychologically based symptoms; and ability to perform at a consistent pace without an unreasonable number/length of rest periods. Anderson checked “Yes” when asked, “Have the above limitations lasted or can they be expected to last for 12 months or longer?” Anderson also filled out the comment section of the form, writing as follows: “Client has poor coping skills, auditory hallucinations, unstable moods, and severe anxiety. These psychiatric symptoms are complicated by multiple medical problems.”

In July 2008, after a short respite from some of her more severe mental health problems — though she did experience at least one “seizure” during that respite — Garrison again returned to Anderson with hallucinations, panic attacks, insomnia, racing thoughts, blackouts, unstable mood, and paranoia. Anderson assessed GAF scores of 45-50 and determined that Garrison’s attention and concentration, as well as her insight and judgment, were “limited.” One month later, Garrison collapsed and then fainted in the middle of a counseling session, and continued to report auditory hallucinations, anxiety, and racing thoughts. Anderson’s records show that Garrison’s attention and concentration, as well as her insight and judgment, remained “limited.”

Garrison’s GAF score dropped again in September 2008, this time to 50, and Anderson again recorded issues including hallucinations, insomnia, anxiety, and racing thoughts. By late September, Garrison’s issues expanded to include overwhelming depression and paranoia. Throughout this period, her attention and concentration, as well as her insight and judgment, remained “limited.” Then, from November 2008 to February 2009, Garrison’s mental health improved somewhat. Anderson recorded GAF scores ranging from 55 to 61 in this period, noting that, although Garrison remained anxious and was at times tearful, treatment was helping to alleviate Garrison’s more severe symptoms, including her panic attacks, paranoia, and hallucinations.

As before, though, this short-lived and limited uptick in Garrison’s mental health soon ended. By April 2009, Garrison was suicidal and panicked, again reported racing thoughts and pseudo-seizures, and again stated that she was having trouble with — and had stopped some of — her medications. Late in April 2009, Anderson assessed a GAF score of 50 and noted that Garrison was displaying hypomanic symptoms, sleeping less, and had only “partial” insight and judgment.

Garrison’s GAF score remained 50 in May 2009. That month, Anderson observed that Garrison had fainted on several occasions when upset, was still experiencing insomnia and racing thoughts, and still had only partial insight and judgment. In June 2009, Garrison felt a bit better, but was still “up and down.” Her GAF score remained 50, an indication of “serious symptoms” or “serious impairment in social, occupational, or school functioning,” she was still troubled by bouts of racing thoughts and anxiety, and Anderson was still trying, apparently with only mixed success, to adjust her medications.

II

At the July 14, 2009 hearing before the ALJ, Garrison testified about how her physical and mental impairments affect her daily life. A VE also testified, mainly by answering a series of hypothetical questions.

A. Garrison’s Testimony

At the July 14, 2009 hearing, Garrison testified that she stopped working in April 2007 because she was “having problems with [her] back and [her] neck,” rising to the level of “severe back pain.” She described her neck pain as a “real sharp, burning pain,” that grew worse when she slept wrong, picked stuff up, or bent over, and that radiated into her arms. She added that neither physical therapy nor injections had helped her neck pain; to the contrary, the neck injections had caused an allergic reaction that required hospitalization. She testified that her back pain is sharp, nagging, and constant, that it shoots down to the back sides of her legs, that she cannot bend down to lift or pick things up unless she is sitting, and that the pain is exacerbated by standing or sitting for more than 20 to 30 minutes. She added that the pain is often so bad that she must lie down every three to four hours, that she naps three hours per day because she is tired from the pain, and that the pain prohibits her from picking up her daughter, her laundry, or heavy bags of groceries. Garrison stated that physical therapy did not alleviate her back pain and that the epidural injections she had received helped for only a short while. She also stated that she had gained one hundred pounds from her medication.

Turning to her mental impairments, Garrison testified, “I have a lot of anxiety” and “a lot of ups and downs and depression.” She stated that she experiences panic attacks that sometimes cause her to pass out, and that these attacks are triggered by tasks like grocery shopping alone. She avoids talking to people to prevent stress, occasionally experiences suicidal thoughts, and, when she is feeling depressed, spends days alone in her room with the light out. When she feels “up,” however, she cannot sleep for days at a time and experiences auditory hallucinations in which voices criticize her for ruining her life. Garrison noted that her “seizure” condition has improved somewhat since she started taking Prozac.

Garrison testified that her mother takes care of her, including taking her to church, to doctors, and to the grocery store, and doing her laundry. She stated that, although she can carry a five pound bag of potatoes, she cannot carry a ten pound bag (or a watermelon), and that she always carries the lightest groceries. Describing a typical day, Garrison stated: “Just get up, feed my daughter, call Griffey the turtle. Sometimes I water the plants. Get myself dressed, get her dressed. Then by the time lunchtime comes around, I’m down for a good two, three hours. And then I’ll get back up and make sure she’s had dinner. My mom watches her while I’m sleeping. I’ll make sure she has dinner, and then usually I’m back to sleep by 7:30, 8:00.”

B. The Vocational Expert’s Testimony

The VE answered a series of questions posed by the ALJ and by Garrison’s lawyer about a hypothetical person’s ability either to perform past relevant work or sustain the demands of work.

The ALJ first posed this hypothetical question:

I’m asking you to consider a hypothetical person the same age, education, and work history as Ms. Garrison. The first question is, this hypothetical person could perform work frequently lifting and carrying 10 pounds, occasionally 20 pounds; could stand and/or walk with normal breaks about six out of eight hours; sit with normal breaks about six out of eight hours; no limits in pushing or pulling; can never climb ladders, ropes, or scaffolds; can occasionally climb ramps and stairs; occasionally stoop, kneel, crouch, and crawl. This person can occasionally reach overhead. The person must avoid concentrated exposure to fumes, odors, dust, gasses, poor ventilation, and hazards. This person is restricted to performing simple work. Could this hypothetical person perform any of the past relevant work?

The VE replied that such a person could perform the past relevant work of “[t]he cashier/checker, bus monitor, teacher’s aide, and the pizza maker.”

Next, the ALJ posed this question to the VE: “I’d ask you to consider the same hypothetical person. This person frequently had deficiencies of concentration, persistence, or pace resulting in failure to complete tasks in a timely manner. Would that person be able to sustain the mental demands of work?” The VE replied, “No.”

Garrison’s lawyer then posed the following hypothetical question:

[P]lease assume an individual the claimant’s age, education, and work experience, but assume they were limited as follows.... The person had a moderately severe limitation, and that’s defined as an impairment which seriously affects the ability to function, and that would be in the following areas: ability to understand, carry out, and remember instructions; respond appropriately to supervision; respond, [sic] respond appropriate[ly] to co-workers; respond to customary work pressures; and to complete a normal work day and work week without interruptions from psychologically based symptoms; to perform at a consistent pace without an unreasonable number and length of rest periods. Do you agree that the cumulative effect of those moderately severe limitations would preclude both the claimant’s past relevant work and, and all other works?

The VE replied, “I would.”

Finally, Garrison’s lawyer asked, “Given the limitations testified to [by Garrison], were they credible, would you agree those would preclude both the claimant’s past relevant work and all other work?” The VE replied, “Yes.”

III

On October 29, 2009, the ALJ issued a decision concluding that Garrison was not disabled within the meaning of the Social Security Act. At step one of the five-step sequential evaluation process, the ALJ determined that Garrison had not engaged in substantial gainful activity since April 17, 2007, the alleged onset date. At step two, the ALJ found that Garrison had the following medically determinable severe impairments: “borderline intellectual functioning, bipolar disorder, posttraumatic stress disorders, polysubstance dependence (in sustained remission), degenerative disc disease of the lumbar, cervical and thoracic spine, obesity, and asthma.” At step three, the ALJ concluded that Garrison did not meet or medically equal any of the listed impairments in 20 C.F.R. Part 404, Subpart P, Appendix 1. At step four, the ALJ determined that Garrison has the residual functional capacity to perform the exertional requirements of light work as defined in 20 C.F.R. § 404.1567(b), adding that, due to various limitations, Garrison is limited to simple work. She concluded that Garrison is capable of performing past relevant work as a school bus monitor, cashier/checker, and pizza maker. Accordingly, the ALJ determined that Garrison had not been under a disability from April 17, 2007 through October 29, 2009.

In assessing Garrison’s residual functional capacity, the ALJ started by discrediting part of Garrison’s testimony. The ALJ concluded that, while Garrison’s medically determinable impairments could reasonably be expected to produce the alleged symptoms, Garrison’s “statements concerning the intensity, persistence, and limiting effects of these symptoms are not credible to the extent they are inconsistent with finding that the claimant has no severe impairment or combination of impairments .... ” The ALJ reasoned that, although Garrison had suffered mental impairments rendering her “non functional for a good part of the period from April 2007 to the end of 2007,” Garrison’s mental health had improved with medication. The ALJ added that Garrison’s mental health deteriorated when she stopped taking her medication, noted that Garrison’s GAF score had reached 60 in November 2008, and pointed out that Garrison told Anderson in February 2009 that she was stable with her medications.

Turning to Garrison’s testimony concerning her physical impairments, the ALJ stated that “the claimant physically improved in 2007 and 2008 with conservative medical treatment, i.e., physical therapy and epidural injections.” The ALJ also observed that Garrison helped prepare meals, cleaned her room, talked on the phone frequently, and helped care for her own daughter, activities that the ALJ deemed inconsistent with Garrison’s allegations of disability. In sum, the ALJ remarked that “there may have a [sic] short time during the adjudicatory period in which the claimant has been non-functional,” but “most of these times were when the claimant was either not taking her psychotropic medications or before she had undergone physical therapy for her neck/back pain.”

Next, the ALJ considered the medical evidence. Citing only Wang’s PFC Questionnaire from February 2008, and Anderson’s 2008 Assessment (also from February 2008), the ALJ “assign[ed] little weight to the medical opinions reflected in these assessments, as Dr. Wang provided no rationale for his medical conclusions and Ms. Anderson, as a nurse practitioner, is not a psychiatrist or psychologist.” The ALJ added: “In addition, their findings do not comport with the improvement the claimant consistently showed when she adhered to her medication regimen. The same critique is applied to Ms. Anderson’s most recent medical source statement dated June 23, 2009.” After rejecting Wang and Anderson’s conclusions, the ALJ gave substantial weight to the conclusions of the state agency consultants, Griffith and General, regarding Garrison’s physical and mental residual functional capacity. In the ALJ’s view, “their findings limited the claimant to range [sic] of light, simple work.” Matching this residual functional capacity assessment to the hypothetical questions posed to the VE, the ALJ concluded that Garrison could work as a bus monitor, pizza maker, and cashier/checker, and therefore rejected her application.

IV

After exhausting administrative remedies, Garrison appealed to the district court, which concluded that her case should be remanded to the ALJ for further proceedings. In reaching that result, the district court first concluded that the “panoply of reasons” given by the ALJ for rejecting Garrison’s symptom testimony was sufficiently specific, clear, and convincing.

The district court then considered the ALJ’s decision to give substantial weight to the state agency consultants, Griffith and General, rather than to Garrison’s treating medical caretakers, Wang and Anderson. Focusing, like the ALJ, on the two check-box forms that Wang and Anderson completed in February 2008, the district court stated that the ALJ had “provided only a perfunctory explanation of her reasons for not giving significant weight to the opinions of those who had been treating Claimant.” Specifically, with respect to Wang, the district court found that the ALJ failed to address “any of the particulars of [Wang’s] lengthy treatment of Claimant and numerous treatment notes,” said “nothing about whether Dr. Wang’s opinion, which was based on the effect of Claimant’s pain, is consistent or inconsistent with other evidence regarding Claimant’s pain,” and offered only a “conclusory” discussion of his treatment records. With respect to Anderson, the district court stated that the ALJ did not recognize that a nurse practitioner is an acceptable medical source under 20 C.F.R. § 404.1513(d) and that she failed to consider the factors set forth in 20 C.F.R. § 404.1527(c) in analyzing her opinion. The district court believed that, as a result of these errors, it was “unable to assess whether the ALJ properly accorded little weight” to Wang and Anderson’s opinions.

Turning to Griffith and General, the district court first noted that “Griffith had no medical records from any treating or other examining physician to review,” and agreed with Garrison that “it is not clear from the ALJ’s decision that she had an adequate basis for using Dr. Griffith’s onetime examination to displace the opinion of a treating physician such as Dr. Wang.” The district court added that the ALJ had misunderstood General’s report which, read properly, “adds to the balance in favor of finding Claimant disabled, although the fact that the evaluation was a snapshot of her condition at a particular point in time renders the evaluation standing alone an insufficient basis to determine that Claimant had a disability of the requisite duration.”

Ultimately, the district court concluded that it lacked a sufficient explanation from the ALJ of why she declined to give substantial weight to the opinions of Wang and Anderson, and why she instead credited Griffith. The district court therefore remanded to the Commissioner, stating that, if the opinions of Wang and Anderson were properly given little weight, Garrison would not be entitled to an award of benefits.

Garrison timely appealed this ruling, contending that the district court abused its discretion in remanding to the Commissioner for further proceedings instead of remanding for a calculation and award of benefits.

DISCUSSION

I

42 U.S.C. § 405(g) provides for judicial review of the Social Security Administration’s disability determinations: “The court shall have power to enter ... a judgment affirming, modifying, or reversing the decision of the Commissioner of Social Security, with or without remanding the cause for a rehearing.”

An ALJ’s disability determination should be upheld unless it contains legal error or is not supported by substantial evidence. See Stout v. Comm’r, Soc. Sec. Admin., 454 F.3d 1050, 1052 (9th Cir.2006); 42 U.S.C. §§ 405(g), 1383(c)(3). “ ‘Substantial evidence’ means more than a mere scintilla, but less than a preponderance; it is such relevant evidence as a reasonable person might accept as adequate to support a conclusion.” Lingenfelter v. Astrue, 504 F.3d 1028, 1035 (9th Cir.2007). “[W]e must consider the entire record as a whole, weighing both the evidence that supports and the evidence that detracts from the Commissioner’s conclusion, and may not affirm simply by isolating a specific quantum of supporting evidence.” Id. (citations and quotation marks omitted). “The ALJ is responsible for determining credibility, resolving conflicts in medical testimony, and for resolving ambiguities.” Andrews v. Shalala, 53 F.3d 1035, 1039 (9th Cir.1995). Where “the evidence can reasonably support either affirming or reversing a decision, we may not substitute our judgment for that of the [ALJ].” Id. (citation omitted). We review only the reasons provided by the ALJ in the disability determination and may not affirm the ALJ on a ground upon which he did not rely. See Connett v. Barnhart, 340 F.3d 871, 874 (9th Cir.2003).

“[A] district court’s decision to affirm, reverse or modify a determination of the Social Security Administration is reviewed de novo on appeal. We also review de novo a district court’s determination to remand a case to the Commissioner.” Harman v. Apfel, 211 F.3d 1172, 1174 (9th Cir.2000) (citation omitted).

II

We conclude that the ALJ erred in rejecting Wang and Anderson’s medical opinions, that she misunderstood General’s opinion of Garrison’s impairments, and that she failed to meet the requirement of offering specific, clear, and convincing reasons for discrediting Garrison’s symptom testimony.

A. The Five-Step Sequential Process

The Social Security Act defines “disability” as the inability to engage “in any substantial gainful activity by reason of any medically determinable physical or mental impairment which can be expected to result in death or which has lasted or can be expected to last for a continuous period of not less than 12 months.” 42 U.S.C. § 1382c(a)(3)(A). A claimant “shall be determined to be under a disability only if his physical or mental impairment or impairments are of such severity that he is not only unable to do his previous work but cannot, considering his age, education, and work experience, engage in any other kind of substantial gainful work which exists in the national economy, regardless of whether such work exists in the immediate area in which he lives, or whether a specific job vacancy exists for him, or whether he would be hired if he applied for work.” § 1382c(a)(3)(B).

ALJs apply a five-step evaluation process to determine whether a claimant qualifies as disabled. Ludwig v. Astrue, 681 F.3d 1047, 1048 n. 1 (9th Cir.2012). That procedure is set forth at 20 C.F.R. § 404.1520(a)(4):

(4) The five-step sequential evaluation process. The sequential evaluation process is a series of five “steps” that we follow in a set order ... If we can find that you are disabled or not disabled at a step, we make our determination or decision and we do not go on to the next step. If we cannot find that you are disabled or not disabled at a step, we go on to the next step. Before we go from step three to step four, we assess your residual functional capacity.... We use this residual functional capacity assessment at both step four and step five when we evaluate your claim at these steps. These are the five steps we follow:
(i) At the first step, we consider your work activity, if any. If you are doing substantial gainful activity, we will find that you are not disabled ...
(ii) At the second step, we consider the medical severity of your impairment(s). If you do not have a severe medically determinable physical or mental impairment that meets the duration requirement in § 404.1509, or a combination of impairments that is severe and meets the duration requirement, we will find that you are not disabled ...
(iii) At the third step, we also consider the medical severity of your impairment(s). If you have an impairment(s) that meets or equals one of our listings in appendix 1 of this subpart and meets the duration requirement, we will find that you are disabled ...
(iv) At the fourth step, we consider our assessment of your residual functional capacity and your past relevant work. If you can still do your past relevant work, we will find that you are not disabled ...
(v) At the fifth and last step, we consider our assessment of your residual functional capacity and your age, education, and work experience to see if you can make an adjustment to other work. If you can make an adjustment to other work, we will find that you are not disabled. If you cannot make an adjustment to other work, we will find that you are disabled ...

20 C.F.R. § 404.1520(a). “The burden of proof is on the claimant at steps one through four, but shifts to the Commissioner at step five.” Bray v. Comm’r of Soc. Sec. Admin., 554 F.3d 1219, 1222 (9th Cir.2009).

At steps four and five, the ALJ determines a claimant’s residual functional capacity (“RFC”). 20 C.F.R. § 416.920(e). RFC is “what [one] can still do despite [one’s] limitations.” 20 C.F.R. § 416.945(a)(1). It is “based on all the relevant medical and other evidence in [the] case record.” Id. If a claimant has multiple impairments, they are all included in the assessment. § 416.920(a)(2). The ALJ must consider a claimant’s physical and mental abilities, § 416.920(b) and (c), as well as the total limiting effects caused by medically determinable impairments and the claimant’s subjective experiences of pain, § 416.920(e). The RFC is used at step four to determine if a claimant can do past relevant work and at step five to determine if a claimant can adjust to other work. Id.

If, at step four, “a claimant shows that he or she cannot return to his or her previous job, the burden of proof shifts to the Secretary to show that the claimant can do other kinds of work.” Embrey v. Bowen, 849 F.2d 418, 422 (9th Cir.1988). Thus, “[a]t step five, the ALJ can call upon a [VE] to testify as to: (1) what jobs the claimant, given his or her [RFC], would be able to do; and (2) the availability of such jobs in the national economy.” Tackett v. Apfel, 180 F.3d 1094, 1101 (9th Cir.1999). The ALJ may pose hypothetical questions to the expert that “set out all of the claimant’s impairments” for the VE’s consideration. Gamer v. Secretary of Health and Human Servs., 815 F.2d 1275, 1279 (9th Cir.1987). “The ALJ’s depiction of the claimant’s disability must be accurate, detailed, and supported by the medical record.” Tackett, 180 F.3d at 1101 (citation omitted). “The testimony of a [VE] is valuable only to the extent that it is supported by medical evidence” and has “no evidentiary value if the assumptions in the hypothetical are not supported by the record.” Magallanes v. Bowen, 881 F.2d 747, 756 (9th Cir.1989) (citations omitted). “The [VE] then ‘ “translates [these] factual scenarios into realistic job market probabilities” by testifying on the record to what kinds of jobs the claimant still can perform and whether there is a sufficient number of those jobs available in the claimant’s region or in several other regions of the economy to support a finding of “not disabled.” ’ ” Tackett, 180 F.3d at 1101 (citations omitted) (second alteration in the original).

B. The ALJ Erred in Assigning Little Weight to Wang and Anderson’s Opinions, and Misunderstood General’s Opinion

1. Applicable Law

“In disability benefits cases ... physicians may render medical, clinical opinions, or they may render opinions on the ultimate issue of disability — the claimant’s ability to perform work.” Reddick v. Chater, 157 F.3d 715, 725 (9th Cir.1998) (citation omitted). “In conjunction with the relevant regulations, we have ... developed standards that guide our analysis of an ALJ’s weighing of medical evidence.” Ryan v. Comm’r of Soc. Sec., 528 F.3d 1194, 1198 (9th Cir.2008). Specifically, we “distinguish among the opinions of three types of physicians: (1) those who treat the claimant (treating physicians); (2) those who examine but do not treat the claimant (examining physicians); and (3) those who neither examine nor treat the claimant (nonexamining physicians).” Lester v. Chater, 81 F.3d 821, 830 (9th Cir.1995). “As a general rule, more weight should be given to the opinion of a treating source than to the opinion of doctors who do not treat the claimant.” Id. (citing Winans v. Bowen, 853 F.2d 643, 647 (9th Cir.1987)). While the opinion of a treating physician is thus entitled to greater weight than that of an examining physician, the opinion of an examining physician is entitled to greater weight than that of a non-examining physician. See Ryan, 528 F.3d at 1198. “The weight afforded a non-examining physician’s testimony depends ‘on the degree to which [he] provide[s] supporting explanations for [his] opinions.’ ” Id. (quoting § 404.1527(d)(3)).

“If a treating or examining doctor’s opinion is contradicted by another doctor’s opinion, an ALJ may only reject it by providing specific and legitimate reasons that are supported by substantial evidence.” Id. This is so because, even when contradicted, a treating or examining physician’s opinion is still owed deference and will often be “entitled to the greatest weight ... even if it does not meet the test for controlling weight.” Orn v. Astrue, 495 F.3d 625, 633 (9th Cir.2007). An ALJ can satisfy the “substantial evidence” requirement by “setting out a detailed and thorough summary of the facts and conflicting clinical evidence, stating his interpretation thereof, and making findings.” Reddick, 157 F.3d at 725. “The ALJ must do more than state conclusions. He must set forth his own interpretations and explain why they, rather than the doctors’, are correct.” Id. (citation omitted).

Where an ALJ does not explicitly reject a medical opinion or set forth specific, legitimate reasons for crediting one medical opinion over another, he errs. See Nguyen v. Chater, 100 F.3d 1462, 1464 (9th Cir.1996). In other words, an ALJ errs when he rejects a medical opinion or assigns it little weight while doing nothing more than ignoring it, asserting without explanation that another medical opinion is more persuasive, or criticizing it with boilerplate language that fails to offer a substantive basis for his conclusion. See id.

2. Application of Law to Fact

In a ruling that the government does not contest on appeal, the district court correctly concluded that the ALJ did not offer specific and legitimate reasons supported by substantial evidence for rejecting the testimony of Wang and Anderson. The district court also correctly concluded that the ALJ misunderstood General’s opinion regarding Garrison’s mental health. We briefly address this issue because the ALJ’s significant and numerous errors in evaluating the medical opinion evidence help clarify why these opinions are, in fact, worthy of substantial weight and why it is appropriate to remand this case for an award of benefits.

In evaluating Wang’s testimony, the ALJ committed a variety of egregious and important errors: (1) she entirely ignored most of his treatment records, including reports from Dr. Feldman, dozens of medical test results, and Wang’s own treatment notes; (2) she failed to recognize that the opinions expressed in check-box form in the February 2008 PFC Questionnaire were based on significant experience with Garrison and supported by numerous records, and were therefore entitled to weight that an otherwise unsupported and unexplained check-box form would not merit; (3) she did not explicitly compare Wang’s records to other medical evidence — and therefore failed to recognize that no other treating or examining physician disagreed with Wang, and that Griffith, the consultant whose views differed from Wang’s, wrote his check-box report early in November 2007, very early in Garrison’s course of treatment, and admitted in his report that he lacked access to Garrison’s treatment records and statements; (4) she did not evaluate Wang’s records for internal consistency or inconsistency in his description of Garrison’s symptoms, an evaluation that would have disclosed consistent reports of burning, tingling, and numbness radiating from her back and neck into her extremities, causing weakness and intense pain; (5) she did not recognize that because Wang is a specialist, his opinion is owed greater weight as a matter of regulation; (6) more generally, she failed to afford the deference to which Wang was presumptively entitled under both Social Security regulations and our precedent as Garrison’s treating physician; and (7) she manufactured a conflict with respect to the outcome of treatment by asserting that Wang’s records showed “consisten[t]” improvement, when in fact they show consistent cervical and lumbar radiculopathy that responded only very briefly and partially to treatment.

The ALJ also committed a variety of errors in evaluating Anderson’s opinion evidence. Most important, she did not recognize that Anderson, as a nurse practitioner, qualified as an “other source[ ]” that can “provide evidence about the severity of [a claimant’s] impairment(s) and how it affects [the claimant’s] ability to work.” 20 C.F.R. § 404.1513(d). Further, the ALJ committed many of the same errors in assessing Anderson’s reports that she did in assessing Wang’s. For example, the ALJ assigned little weight to Anderson’s 2008 and 2009 summary reports, both of which stated that Garrison suffered a variety of moderate and moderately severe impairments expected to last more than a year, while apparently failing to recognize that those reports were supported by voluminous notes. The ALJ also manufactured a conflict by identifying two or three reports of improvement in Garrison’s mental health and asserting, without reference to any other treatment records or any other explanation, that Anderson’s considered conclusions about Garrison’s overall prognosis merited little weight.

Finally, the ALJ completely misunderstood General’s report. Whereas the ALJ described it as supporting a finding that Garrison is not disabled and is capable of “light, simple work,” General expressly stated that Garrison’s “prognosis for returning to work is currently poor,” “she does not have sufficient emotional control,” and “her ability to perform work-related tasks is currently inadequate based on cognitive functions such as attention, concentration, processing-speed and short-term memory.” General, an examining physician, bolstered these conclusions with diagnoses including bipolar disorder, PTSD, and borderline intellectual functioning. The ALJ’s belief that General’s report weighed against a finding of disability was mistaken.

C. The ALJ failed to offer specific, clear, and convincing reasons for discrediting Garrison’s symptom testimony

The ALJ discredited, to the extent it was inconsistent with her finding that Garrison is not disabled, Garrison’s testimony about the intensity, persistence, and limiting effects of her symptoms. In doing so, the ALJ erred.

1. Applicable Law

An ALJ engages in a two-step analysis to determine whether a claimant’s testimony regarding subjective pain or symptoms is credible. “First, the ALJ must determine whether the claimant has presented objective medical evidence of an underlying impairment ‘which could reasonably be expected to produce the pain or other symptoms alleged.’” Lingenfelter, 504 F.3d at 1035-36 (quoting Bunnell v. Sullivan, 947 F.2d 341, 344 (9th Cir.1991) (en banc) (internal quotation marks omitted)). In this analysis, the claimant is not required to show “that her impairment could reasonably be expected to cause the severity of the symptom she has alleged; she need only show that it could reasonably have caused some degree of the symptom.” Smolen v. Chater, 80 F.3d 1273, 1282 (9th Cir.1996). Nor must a claimant produce “objective medical evidence of the pain or fatigue itself, or the severity thereof.” Id.

If the claimant satisfies the first step of this analysis, and there is no evidence of malingering, “the ALJ can reject the claimant’s testimony about the severity of her symptoms only by offering specific, clear and convincing reasons for doing so.” Smolen, 80 F.3d at 1281; see also Robbins v. Soc. Sec. Admin., 466 F.3d 880, 883 (9th Cir.2006) (“[U]nless an ALJ makes a finding of malingering based on affirmative evidence thereof, he or she may only find an applicant not credible by making specific findings as to credibility and stating clear and convincing reasons for each.”). This is not an easy requirement to meet: “The clear and convincing standard is the most demanding required in Social Security cases.” Moore v. Comm’r of Soc. Sec. Admin., 278 F.3d 920, 924 (9th Cir.2002).

2. Application of Law to Fact

Garrison testified about her physical and mental health. We separately address the ALJ’s grounds for discrediting each part of Garrison’s testimony.

a. Garrison’s Pain Testimony

The ALJ rejected Garrison’s pain testimony on two grounds: (1) Garrison improved in 2007 and 2008 with the “conservative” treatments of epidural injections and physical therapy; and (2) Garrison engaged in daily activities including talking on the phone, preparing meals, cleaning her room, and helping to care for her daughter.

The first of these reasons is belied by the evidence and must be rejected. Garrison’s medical records show that physical therapy afforded her only partial and short-lived relief of her lower back pain, and no effective relief for her radiating neck pain. Turning to the epidural shots, Wang and Feldman’s records make clear that epidural shots never provided Garrison any relief for her neck pain, and that they relieved Garrison’s back pain for only variable, brief periods of time, ranging from a couple of months to a few days. The other treatments prescribed by Wang, including pain pills, caused side effects including intense sleepiness and drowsiness and, even when taken several times per day, provided only limited periods of relief from the otherwise-constant pain.

In sum, there is no support in the record for the ALJ’s belief that physical therapy and epidural shots alleviated Garrison’s pain enough that her testimony regarding pain was incredible. To the contrary, the record shows that, despite Wang’s efforts, Garrison’s neck and back pain, which radiated into her shoulders, arms, and legs, persisted largely unabated after April 2007. Garrison’s testimony that she was disabled by near-constant, intense pain is consistent with this evidence, as is her testimony that she must take frequent rests, has difficulty standing or sitting for more than 20 to 30 minutes, and cannot carry heavy items.

The ALJ’s other reason for discrediting Garrison’s testimony was its supposed inconsistency with her reported daily activities, which, again, included talking on the phone, preparing meals, cleaning her room, and helping to care for her daughter. The ALJ committed two errors here. First, she mischaracterized Garrison’s testimony. Garrison repeatedly emphasized that in performing many daily tasks, including caring for her daughter, she was heavily assisted by her mother. She also made clear that she is regularly prohibited by her pain from engaging in activities such as doing laundry, picking up her daughter, and carrying bags that weigh more than a few pounds. Finally, Garrison testified that after performing such activities, she often must rest, leading her to nap several hours per day.

Second, the ALJ erred in finding that these activities, if performed in the manner that Garrison described, are inconsistent with the pain-related impairments that Garrison described in her testimony. We have repeatedly warned that ALJs must be especially cautious in concluding that daily activities are inconsistent with testimony about pain, because impairments that would unquestionably preclude work and all the pressures of a workplace environment will often be consistent with doing more than merely resting in bed all day. See, e.g., Smolen, 80 F.3d at 1287 n. 7 (“The Social Security Act does not require that claimants be utterly incapacitated to be eligible for benefits, and many home activities may not be easily transferable to a work environment where it might be impossible to rest periodically or take medication.” (citation omitted)); Fair v. Bowen, 885 F.2d 597, 603 (9th Cir.1989) (“[M]any home activities are not easily transferable to what may be the more grueling environment of the workplace, where it might be impossible to periodically rest or take medication.”). Recognizing that “disability claimants should not be penalized for attempting to lead normal lives in the face of their limitations,” we have held that “[o]nly if [her] level of activity were inconsistent with [a claimant’s] claimed limitations would these activities have any bearing on [her] credibility.” Reddick v. Chater, 157 F.3d at 722 (citations omitted); see also Bjornson v. Astrue, 671 F.3d 640, 647 (7th Cir.2012) (“The critical differences between activities of daily living and activities in a full-time job are that a person has more flexibility in scheduling the former than the latter, can get help from other persons ..., and is not held to a minimum standard of performance, as she would be by an employer. The failure to recognize these differences is a recurrent, and deplorable, feature of opinions by administrative law judges in social security disability cases.” (citations omitted)).

Here, Garrison’s daily activities, as she described them in her testimony, were consistent with her statements about the impairments caused by her pain. The ability to talk on the phone, prepare meals once or twice a day, occasionally clean one’s room, and, with significant assistance, care for one’s daughter, all while taking frequent hours-long rests, avoiding any heavy lifting, and lying in bed most of the day, is consistent with the pain that Garrison described in her testimony. It is also consistent with an inability to function in a workplace environment. Accordingly, the supposed inconsistencies between Garrison’s daily activities and her testimony do not satisfy the requirement of a clear, convincing, and specific reason to discredit Garrison’s testimony regarding her pain-related impairments.

b. Garrison’s Mental Health Testimony

The ALJ discredited Garrison’s mental health testimony mainly on the ground that the record showed that Garrison’s condition had improved due to medication at a few points between April 2007 and June 2009. The ALJ added that some of Garrison’s mental impairments were caused by Garrison going off her medication. These are not clear, convincing, and specific grounds for rejecting Garrison’s testimony that, since April 2007, she had suffered panic attacks, “a lot of ups and downs and depression,” severe anxiety, occasional suicidal thoughts, and bouts of paranoia and mania — symptoms that caused major difficulties with social functioning and responding to such stresses as shopping unaccompanied for groceries.

As we have emphasized while discussing mental health issues, it is error to reject a claimant’s testimony merely because symptoms wax and wane in the course of treatment. Cycles of improvement and debilitating symptoms are a common occurrence, and in such circumstances it is error for an ALJ to pick out a few isolated instances of improvement over a period of months or years and to treat them as a basis for concluding a claimant is capable of working. See, e.g., Holohan v. Massanari, 246 F.3d 1195, 1205 (9th Cir.2001) (“[The treating physician’s] statements must be read in context of the overall diagnostic picture he draws. That a person who suffers from severe panic attacks, anxiety, and depression makes some improvement does not mean that the person’s impairments no longer seriously affect her ability to function in a workplace.”). Reports of “improvement” in the context of mental health issues must be interpreted with an understanding of the patient’s overall well-being and the nature of her symptoms. See Ryan, 528 F.3d at 1200-01 (“Nor are the references in [a doctor’s] notes that Ryan’s anxiety and depression were ‘improving’ sufficient to undermine the repeated diagnosis of those conditions, or [another doctor’s] more detailed report.”). They must also be interpreted with an awareness that improved functioning while being treated and while limiting environmental stressors does not always mean that a claimant can function effectively in a workplace. See, e.g., Hutsell, 259 F.3d at 712 (“We also believe that the Commissioner erroneously relied too heavily on indications in the medical record that Hutsell was ‘doing well,’ because doing well for the purposes of a treatment program has no necessary relation to a claimant’s ability to work or to her work-related functional capacity.”). 
Caution in making such an inference is especially appropriate when no doctor or other medical expert has opined, on the basis of a full review of all relevant records, that a mental health patient is capable of working or is prepared to return to work. Cf. Rodriguez v. Bowen, 876 F.2d 759, 763 (9th Cir.1989) (“The ALJ’s conclusion that Rodriguez was responding to treatment also does not provide a clear and convincing reason for disregarding Dr. Pettinger’s opinion. No physician opined that any improvement would allow Rodriguez to return to work.”).

These rules clarify the nature of the ALJ’s error. Rather than describe Garrison’s symptoms, course of treatment, and bouts of remission, and thereby chart a course of improvement, the ALJ improperly singled out a few periods of temporary well-being from a sustained period of impairment and relied on those instances to discredit Garrison. While ALJs obviously must rely on examples to show why they do not believe that a claimant is credible, the data points they choose must in fact constitute examples of a broader development to satisfy the applicable “clear and convincing” standard. Here, the record reveals a tortuous path: some symptoms came and went (e.g., paranoia, hallucinations, pseudo-seizures), some symptoms persisted nearly the whole period (e.g., insomnia, bouts of depression and mania), and still other symptoms appear to have remained a constant source of impairment (e.g., intense anxiety). Garrison’s diagnoses of PTSD and bipolar disorder remained constant across all treatment records, and her GAF score consistently hovered around 50 to 55. She remained in this condition even while going to great lengths to minimize stressors in her life — to the point that she could not go to the grocery store alone — and, when she did try to work for a brief period, was fired because of her mental impairments. The ALJ erred in concluding that a few short-lived periods of temporary improvement in Garrison’s mental health symptoms undermined Garrison’s testimony.

c. Conclusion

The ALJ did not offer specific, clear, and convincing reasons for rejecting Garrison’s testimony concerning her physical and mental impairments. In fact, the reasons given by the ALJ not only fail this demanding standard, but also would fail a far more forgiving inquiry, as they are plainly belied by the record and rest upon mischaracterizations of Garrison’s testimony.

III

Reviewing for abuse of discretion, see Harman, 211 F.3d at 1173, we reverse the district court’s decision to remand this case to the ALJ for further proceedings, and instead remand to the district court with instructions to remand to the ALJ for a calculation and award of appropriate benefits.

A. Applicable Law

Usually, “[i]f additional proceedings can remedy defects in the original administrative proceeding, a social security case should be remanded.” Lewin v. Schweiker, 654 F.2d 631, 635 (9th Cir.1981). The Social Security Act, however, makes clear that courts are empowered to affirm, modify, or reverse a decision by the Commissioner “with or without remanding the cause for a rehearing.” 42 U.S.C. § 405(g) (emphasis added). Accordingly, every Court of Appeals has recognized that in appropriate circumstances courts are free to reverse and remand a determination by the Commissioner with instructions to calculate and award benefits. See, e.g., Gentry v. Comm’r of Soc. Sec., 741 F.3d 708, 730 (6th Cir.2014); Jones v. Astrue, 650 F.3d 772 (D.C.Cir.2011); Punzio v. Astrue, 630 F.3d 704, 713 (7th Cir.2011); Salazar v. Barnhart, 468 F.3d 615, 626 (10th Cir.2006); Hines v. Barnhart, 453 F.3d 559, 567 (4th Cir.2006); Seavey v. Barnhart, 276 F.3d 1 (1st Cir.2001); Williams v. Apfel, 204 F.3d 48 (2d Cir.2000); McQueen v. Apfel, 168 F.3d 152, 156 (5th Cir.1999); Davis v. Shalala, 985 F.2d 528, 534 (11th Cir.1993); Podedworny v. Harris, 745 F.2d 210, 221-22 (3d Cir.1984); Parsons v. Heckler, 739 F.2d 1334, 1341 (8th Cir.1984). Courts have generally exercised this power when it is clear from the record that a claimant is entitled to benefits, observing on occasion that inequitable conduct on the part of the Commissioner can strengthen, though not control, the case for such a remand.

This Circuit clarified the scope of judicial power to remand for an award of benefits in Varney v. Sec’y of Health & Human Servs., 859 F.2d 1396 (9th Cir.1988) (“Varney II ”). There, we held that “where there are no outstanding issues that must be resolved before a proper disability determination can be made, and where it is clear from the administrative record that the ALJ would be required to award benefits if the claimant’s excess pain testimony were credited, we will not remand solely to allow the ALJ to make specific findings regarding that testimony. Rather, we will ... take that testimony to be established as true.” Id. at 1401. We explained that this credit-as-true rule is designed to achieve fairness and efficiency:

We believe [that this] rule promotes the objectives we have identified in prior disability cases. Requiring the ALJs to specify any factors discrediting a claimant at the first opportunity helps to improve the performance of the ALJs by discouraging them from reaching a conclusion first, and then attempting to justify it by ignoring competent evidence in the record that suggests an opposite result. It helps to ensure that pain testimony will be carefully assessed and its importance recognized. Moreover, it avoids unnecessary duplication in the administrative hearings and reduces the administrative burden caused by requiring multiple proceedings in the same case. Perhaps most important, by ensuring that credible claimants’ testimony is accepted the first time around, the rule reduces the delay and uncertainty often found in this area of the law, and ensures that deserving claimants will receive benefits as soon as possible. As already noted, applicants for disability benefits often suffer from painful and debilitating conditions, as well as severe economic hardship. Delaying the payment of benefits by requiring multiple administrative proceedings that are duplicative and unnecessary only serves to cause the applicant further damage — financial, medical, and emotional. Such damage can never be remedied. Without endangering the integrity of the disability determination process, a principal goal of that process must be the speedy resolution of disability applicants’ claims. At the same time, the rule does not unduly burden the ALJs, nor should it result in the wrongful award of benefits ... [I]f grounds for [concluding that a claimant is not disabled] exist, it is both reasonable and desirable to require the ALJ to articulate them in the original decision.

Id. at 1398-99 (quotation marks, citations, and alterations omitted). In light of these concerns, we noted, “[w]here remand would unnecessarily delay the receipt of benefits, judgment for the claimant is appropriate.” Id. at 1399. One year later, in Hammock v. Bowen, we held that the credit-as-true rule applies to medical opinion evidence, not only claimant testimony. See 879 F.2d 498 (9th Cir.1989).

Since Varney II, we have applied the credit-as-true rule in nearly two dozen published opinions. In those cases, we have developed a workable and stable framework for applying the credit-as-true rule. Specifically, we have devised a three-part credit-as-true standard, each part of which must be satisfied in order for a court to remand to an ALJ with instructions to calculate and award benefits: (1) the record has been fully developed and further administrative proceedings would serve no useful purpose; (2) the ALJ has failed to provide legally sufficient reasons for rejecting evidence, whether claimant testimony or medical opinion; and (3) if the improperly discredited evidence were credited as true, the ALJ would be required to find the claimant disabled on remand. See Ryan, 528 F.3d at 1202; Lingenfelter, 504 F.3d at 1041; Orn, 495 F.3d at 640; Benecke v. Barnhart, 379 F.3d 587, 595 (9th Cir.2004); Smolen, 80 F.3d at 1292.

We have, in a number of cases, stated or implied that it would be an abuse of discretion for a district court not to remand for an award of benefits when all of these conditions are met. See, e.g., Lingenfelter, 504 F.3d at 1041; Orn, 495 F.3d at 640; McCartey v. Massanari, 298 F.3d 1072, 1076-77 (9th Cir.2002); Harman, 211 F.3d at 1178; Smolen, 80 F.3d at 1292; Lester, 81 F.3d at 834; Ramirez v. Shalala, 8 F.3d 1449, 1455 (9th Cir.1993); Swenson v. Sullivan, 876 F.2d 683, 689 (9th Cir.1989). In the 2003 case of Connett v. Barnhart, 340 F.3d 871 (9th Cir.2003), however, we cautioned that the credit-as-true rule may not be dispositive of the remand question in all cases. Rather, recognizing that this rule, like most, admits of exceptions meant to preserve the rule’s purpose, we noted that the credit-as-true doctrine envisions “some flexibility.” Id. at 876. Connett then concluded that under the circumstances there present a remand for an award of benefits was not mandatory and remanded for further proceedings. Connett, however, did not address when such flexibility is appropriately exercised — in other words, it did not explain when remand for further proceedings rather than for an award of benefits would be appropriate even though the credit-as-true rule’s conditions are met. We have applied the credit-as-true rule in a number of cases since Connett, but have not in a published opinion exercised the “flexibility” noted in that decision, nor have we clarified the nature of the “flexibility” that we there described. See, e.g., Lingenfelter, 504 F.3d at 1041; Orn, 495 F.3d at 640; Benecke, 379 F.3d at 595; Moisa, 367 F.3d at 887. We do so now.

Recalling that, in social security cases, “the required analysis centers on what the record evidence shows about the existence or non-existence of a disability,” Strauss v. Comm’r of the Soc. Sec. Admin., 635 F.3d 1135, 1138 (9th Cir.2011), Connett’s “flexibility” is properly understood as requiring courts to remand for further proceedings when, even though all conditions of the credit-as-true rule are satisfied, an evaluation of the record as a whole creates serious doubt that a claimant is, in fact, disabled. That interpretation best aligns the credit-as-true rule, which preserves efficiency and fairness in a process that can sometimes take years before benefits are awarded to needy claimants, with the basic requirement that a claimant be disabled in order to receive benefits. Thus, when we conclude that a claimant is otherwise entitled to an immediate award of benefits under the credit-as-true analysis, Connett allows flexibility to remand for further proceedings when the record as a whole creates serious doubt as to whether the claimant is, in fact, disabled within the meaning of the Social Security Act.

As we explain infra, here the district court abused its discretion by remanding for further proceedings where the credit-as-true rule is satisfied and the record afforded no reason to believe that Garrison is not, in fact, disabled.

B. Application of Law to Fact

Garrison unquestionably satisfies all three conditions of the credit-as-true rule. First, there is no need to develop the record or convene further administrative proceedings. Although the Commissioner argues that further proceedings would serve the “useful purpose” of allowing the ALJ to revisit the medical opinions and testimony that she rejected for legally insufficient reasons, our precedent and the objectives of the credit-as-true rule foreclose the argument that a remand for the purpose of allowing the ALJ to have a mulligan qualifies as a remand for a “useful purpose” under the first part of credit-as-true analysis. See Benecke, 379 F.3d at 595 (“Allowing the Commissioner to decide the issue again would create an unfair ‘heads we win; tails, let’s play again’ system of disability benefits adjudication.”); Moisa, 367 F.3d at 887 (“The Commissioner, having lost this appeal, should not have another opportunity to show that Moisa is not credible any more than Moisa, had he lost, should have an opportunity for remand and further proceedings to establish his credibility.” (citation omitted)).

Second, as we have already explained at length, the ALJ failed to provide a legally sufficient reason to reject Garrison’s testimony and the opinions of her treating and examining medical caretakers. We need not repeat this analysis here.

Third, if the improperly discredited evidence were credited as true, it is clear that the ALJ would be required to find Garrison disabled on remand. Our conclusion follows directly from our analysis of the ALJ’s errors and the strength of the improperly discredited evidence, which we credit as true: a treating doctor, a treating nurse practitioner, and an examining psychologist all deemed Garrison to be disabled, Garrison testified to an array of severe physical and mental impairments, and a VE explicitly testified that a person with the impairments described by Garrison or her medical caretakers could not work. Accordingly, Garrison satisfies the requirements of the credit-as-true standard.

Having concluded that Garrison satisfies all three parts of credit-as-true analysis, we now turn to the question whether we should nonetheless exercise “flexibility” under Connett and remand for further proceedings. Here, the answer is clearly no. The Commissioner simply repeats all of the arguments she has already made, asserting that the evidence provided by the doctors and nurse practitioner who treated Garrison should not be given much weight and that Garrison’s testimony should not be accepted. As before, she dwells on the bare handful of records showing slight improvement in Garrison’s condition. At no point does she advance any argument against this evidence that we have not already carefully considered and rejected. Nor does she point to anything in the record that the ALJ overlooked and explain how that evidence casts into serious doubt Garrison’s claim to be disabled. We have independently reviewed the entire record and also have found nothing that would create doubt as to Garrison’s entitlement to the benefits she seeks. The record reflects that, since April 2007, Garrison has been afflicted with a number of severe impairments, including burning back pain that radiates into her legs, sharp neck pain that radiates into her shoulders and arms, intense anxiety and panic attacks, bipolar disorder, PTSD, and bouts of hallucinations, paranoia, and social phobia. Even if some of these symptoms have occasionally abated for brief periods of time — all while Garrison is in ongoing treatment and has significantly minimized environmental stressors — we, like her numerous medical caretakers, see no reason to doubt that she has been entirely incapable of work since April 2007. Thus, considering the Commissioner’s arguments and independently reviewing the record, we see no basis for serious doubt that Garrison is disabled.

In sum, we conclude that Garrison satisfies all three conditions of the credit-as-true rule and that a careful review of the record discloses no reason to seriously doubt that she is, in fact, disabled. A remand for a calculation and award of benefits is therefore required under our credit-as-true precedents.

CONCLUSION

We conclude that the ALJ erred in assigning little weight to Wang and Anderson’s opinions, erred in her characterization of General’s opinion, and failed to offer specific, clear, and convincing reasons for discrediting part of Garrison’s testimony. We further conclude that the district court abused its discretion in remanding for further proceedings. We reverse the judgment of the district court with instructions to remand to the ALJ for the calculation and award of benefits.

REVERSED AND REMANDED 
      
      . In early 2007, a number of MRI scans confirmed disc protrusions in Garrison’s spine. Around the same time, Garrison briefly received physical therapy, the treatment records for which assessed "signs and symptoms consistent [with] low back pain due to pelvic instability and deceased proximal musculature strength/stability.” Physical therapy provided only partial relief, however, and Garrison was forced to stop attending after four sessions due to insurance restrictions and financial troubles.
     
      
      . Some of Wang’s treatment records are signed by Nurse Practitioner Laura Kinney, with an advisement that she was acting under his direction or supervision in preparing the notes.
     
      
      . In February 2008, an EEG test largely ruled out seizures as the cause of Garrison's spells of jerking, stuttering, confusion, and hallucinations.
     
      
      . "A GAF score is a rough estimate of an individual’s psychological, social, and occupational functioning used to reflect the individual’s need for treatment.” Vargas v. Lambert, 159 F.3d 1161, 1164 n. 2 (9th Cir.1998). According to the DSM-IV, a GAF score between 41 and 50 describes “serious symptoms” or "any serious impairment in social, occupational, or school functioning.” A GAF score between 51 and 60 describes "moderate symptoms” or "any moderate difficulty in social, occupational, or school functioning.” Although GAF scores, standing alone, do not control determinations of whether a person’s mental impairments rise to the level of a disability (or interact with physical impairments to create a disability), they may be a useful measurement. We note, however, that GAF scores are typically assessed in controlled, clinical settings that may differ from work environments in important respects. See, e.g., Titles II & XVI: Capability to Do Other Work-The Medical-Vocational Rules As A Framework for Evaluating Solely Nonexertional Impairments, SSR 85-15, 1983-1991 Soc. Sec. Rep. Serv. 343 (S.S.A. 1985) (“The mentally impaired may cease to function effectively when facing such demands as getting to work regularly, having their performance supervised, and remaining in the workplace for a full day.”).
     
      
      . From February 2008 to June 2008, while on a break from some of her medications, Garrison worked four to five times as an "on call” teacher’s aide and also worked a few hours per day as a bus monitor. She was fired by the school district after having a "seizure” on the bus.
     
      
      . We describe the five-step sequential process at greater length infra. For a summary of the process, see Kennedy v. Colvin, 738 F.3d 1172, 1175 (9th Cir.2013) (“The five-step process for disability determinations begins, at the first and second steps, by asking whether a claimant is engaged in ‘substantial gainful activity’ and considering the severity of the claimant’s impairments. See 20 C.F.R. § 416.920(a)(4)(i)-(ii). If the inquiry continues beyond the second step, the third step asks whether the claimant’s impairment or combination of impairments meets or equals a listing under 20 C.F.R. pt. 404, subpt. P, app. 1 and meets the duration requirement. See id. § 416.920(a)(4)(iii). If so, the claimant is considered disabled and benefits are awarded, ending the inquiry. See id. If the process continues beyond the third step, the fourth and fifth steps consider the claimant’s 'residual functional capacity' in determining whether the claimant can still do past relevant work or make an adjustment to other work. See id. § 416.920(a)(4)(iv)-(v).”).
     
      
      . The ALJ noted that Garrison had worked four to five times as a teacher’s aide and part time as a bus monitor in early 2008, but concluded that “[t]his work activity does not rise to the level of presumptive monthly SGA under the Regulations.”
     
      
      . Without concluding that Garrison was a malingerer, the ALJ noted a single inconsistency in Garrison’s testimony: whereas Garrison claimed to have gained one hundred pounds while on medication, the medical records showed that Garrison had gained only approximately twenty pounds.
     
      
      . At the time of the district court’s ruling, the relevant factors were set forth in 20 C.F.R. § 404.1527(d). Additionally, the district court should have cited to 20 C.F.R. § 1513(d), which permits consideration of evidence from a nurse practitioner as an "other source,” rather than to 20 C.F.R. § 1523(d).
     
      
      . See also 20 C.F.R. § 404.1527(c)(2) (“If we find that a treating source's opinion on the issue(s) of the nature and severity of your impairment(s) is well-supported by medically acceptable clinical and laboratory diagnostic techniques and is not inconsistent with the other substantial evidence in your case record, we will give it controlling weight.”).
     
      
      . Social Security regulations provide that, when a treating source’s opinions are not given controlling weight, ALJs must apply the factors set forth in 20 C.F.R. § 404.1527(c)(2)(i-ii) and (c)(3-6) in determining how much weight to give each opinion. These factors are length of the treatment relationship and the frequency of examination, § 404.1527(c)(2)(i), nature and extent of the treatment relationship, § 404.1527(c)(2)(ii), "supportability,” § 404.1527(c)(3), consistency, § 404.1527(c)(4), specialization, § 404.1527(c)(5), and other factors that tend to support or contradict the opinion, § 404.1527(c)(6).
     
      
      . See 20 C.F.R. § 404.1527(d)(3) ("The better an explanation a source provides for an opinion, the more weight we will give that opinion.”).
     
      
      . See 20 C.F.R. § 404.1527(c)(4).
     
      
      . See 20 C.F.R. § 404.1527(c)(5).
     
      
      . See 20 C.F.R. § 404.1527(c)(2).
     
      
      . Further, there is no evidence that anyone other than the ALJ thought that Garrison's reports to Wang were not credible; Wang never indicated his belief that Garrison was exaggerating or lying in her self-reported pain symptoms. Cf. Tommasetti v. Astrue, 533 F.3d 1035, 1041 (9th Cir.2008).
     
      
      . With respect to both Wang and Anderson's opinions, the Commissioner suggests that the ALJ was entitled to reject their opinions on the ground that they were reflected in mere check-box forms — e.g., Wang's 2008 PFC Questionnaire and Anderson’s 2008 and 2009 Assessments. This argument rests on a mistaken factual premise. The check-box forms did not stand alone: they reflected and were entirely consistent with the hundreds of pages of treatment notes created by Wang and Anderson in the course of their relationship with Garrison.
     
      
      . The government’s suggestion that we should apply a lesser standard than “clear and convincing” lacks any support in precedent and must be rejected.
     
      
      . Garrison was forced to discontinue physical therapy early in 2007 because she could not afford it. See Smolen, 80 F.3d at 1284 ("Where a claimant provides evidence of a good reason for not taking medication for her symptoms, her symptom testimony cannot be rejected for not doing so.” (citation omitted)). At no point in the treatment records did Wang or Feldman indicate a belief that physical therapy, if resumed, would provide Garrison with adequate relief.
     
      
      . In any event, we doubt that epidural steroid shots to the neck and lower back qualify as "conservative” medical treatment.
     
      
      . It is also consistent with the mental health impairments that Garrison described in her testimony — impairments that undoubtedly interacted with her physical impairments in a manner that makes her testimony even more credible.
     
      
      . See also Hutsell v. Massanari, 259 F.3d 707, 711 (8th Cir.2001) ("With regard to mental disorders, the Commissioner’s decision must take into account evidence indicating that the claimant’s true functional ability may be substantially less than the claimant asserts or wishes. Given the unpredictable course of mental illness, [s]ymptom-free intervals and brief remissions are generally of uncertain duration and marked by the impending possibility of relapse. Moreover, [i]ndividuals with chronic psychotic disorders commonly have their lives structured in such a way as to minimize stress and reduce their signs and symptoms. Such individuals may be much more impaired for work than their signs and symptoms would indicate.” (quotation marks and citations omitted) (alterations in the original)).
     
      
      . See also Scott v. Astrue, 647 F.3d 734, 739-40 (7th Cir.2011) ("There can be a great distance between a patient who responds to treatment and one who is able to enter the workforce, and that difference is borne out in Dr. Tate’s treatment notes. Those notes show that although Scott had improved with treatment, she nevertheless continued to frequently experience bouts of crying and feelings of paranoia. The ALJ was not permitted to "cherry-pick” from those mixed results to support a denial of benefits.... The very nature of bipolar disorder is that people with the disease experience fluctuations in their symptoms, so any single notation that a patient is feeling better or has had a ‘good day’ does not imply that the condition has been treated.” (citations omitted)).
     
      
      . The ALJ also erred in concluding that Garrison must be discredited on the ground that some — though not all — of her bouts of remission appear to have resulted from Garrison going off some of her medications. As we have remarked, “it is a questionable practice to chastise one with a mental impairment for the exercise of poor judgment in seeking rehabilitation.” Nguyen, 100 F.3d at 1465 (quotation marks and citations omitted). In other words, we do not punish the mentally ill for occasionally going off their medication when the record affords compelling reason to view such departures from prescribed treatment as part of claimants’ underlying mental afflictions. See, e.g., Martinez v. Astrue, 630 F.3d 693, 697 (7th Cir.2011); Spiva v. Astrue, 628 F.3d 346, 351 (7th Cir.2010); Pate-Fires v. Astrue, 564 F.3d 935, 945 (8th Cir.2009). Here, the record shows that Garrison’s occasional decisions to go “off her meds” were at least in part a result of her underlying bipolar disorder and her other psychiatric issues.
     
      
      . The Commissioner contends that the credit-as-true rule is invalid. As he concedes, this argument is foreclosed by precedent. On at least one occasion, in fact, we have specifically considered and rejected some of the arguments advanced anew in the Commissioner’s brief. See Moisa v. Barnhart, 367 F.3d 882, 886-87 (9th Cir.2004).
     
      
      . This third requirement naturally incorporates what we have sometimes described as a distinct requirement of the credit-as-true rule, namely that there are no outstanding issues that must be resolved before a determination of disability can be made. See Smolen, 80 F.3d at 1292.
     
      
      . The district court’s error is understandable in light of our prior failure to make clear the relationship between Connett and the Varney II line of cases. Indeed, several years ago a panel of this Court suggested in dicta that our cases had drifted far enough apart to create an intra-circuit split. Vasquez v. Astrue, 572 F.3d 586, 593 (9th Cir.2009). Following our careful study of the relevant cases, however, we are firmly convinced that they may be fully and fairly reconciled in the manner described herein.
     
      
      . The Commissioner resists this conclusion, arguing that further proceedings are required because the ALJ did not make an RFC determination on the basis of Wang, Anderson, and General’s opinions. Without such an RFC determination, the Commissioner asserts, it would be impossible for us to determine whether Garrison is disabled. This argument is without merit. In no prior credit-as-true case have we suggested that an award of benefits is proper only if the ALJ made a formal RFC finding — and for good reason, because ALJs rarely base their RFC determinations on opinions or testimony that they have rejected (and it will always be such opinions or testimony that are at issue in credit-as-true cases). Instead, we have considered whether the VE answered a question describing a hypothetical person with the RFC that the claimant would possess were the relevant opinion or testimony taken as true. See, e.g., Lingenfelter, 504 F.3d at 1041; Varney II, 859 F.2d at 1401. Here, the ALJ and counsel posed questions to the VE that matched both Garrison’s testimony and the opinions of Wang, Anderson, and General, and in response the VE answered that a person with such an RFC would be unable to work. On that basis, we can conclude that Garrison is disabled without remanding for further proceedings to determine anew her RFC.
     
      
      . At this stage of the credit-as-true analysis, we do not consider arguments against crediting evidence that the ALJ did not make. In other words, as we explained in Harman, we do not consider "whether the ALJ might have articulated a justification for rejecting [a medical] opinion." 211 F.3d at 1179 (emphasis added). This aspect of the credit-as-true rule is grounded in the principles set forth in SEC v. Chenery Corp., 318 U.S. 80, 87-88, 63 S.Ct. 454, 87 L.Ed. 626 (1943).
     
      
      . For example, in urging us to remand for further proceedings, the Commissioner argues that the opinions of Wang, Anderson, and General should be given little weight because some of them are expressed in check-box form. The Commissioner also argues that some periods of improvement by Garrison while she was being treated call into question whether she was disabled. These arguments fail for the same reasons we have already explained in addressing the ALJ’s analysis, most notably that they are utterly belied by the record, inconsistent with our precedent, and contradicted by the opinions of every treating and examining physician.
     
      
      . Although we do so here, we do not mean to suggest that, in every credit-as-true case, courts must undertake an independent review of the entire record.
     
      
      . This conclusion was confirmed by careful questioning of the VE regarding hypothetical persons at the hearing before the ALJ.
     