
    [No. S200923.
    May 29, 2014.]
    SAMUEL DURAN et al., Plaintiffs and Respondents, v. U.S. BANK NATIONAL ASSOCIATION, Defendant and Appellant.
    
      Counsel
    Carlton DiSante & Freudenberger, Carothers DiSante & Freudenberger, Timothy M. Freudenberger, Alison L. Tsao and Kent J. Sprinkle for Defendant and Appellant.
    Horvitz & Levy, Jeremy B. Rosen and Robert H. Wright for Chamber of Commerce of the United States of America and Retail Litigation Center, Inc., as Amici Curiae on behalf of Defendant and Appellant.
    Fred J. Hiestand; Morrison & Foerster, Miriam A. Vogel, Janie F. Schulman and M. Natalie Naugle for California Business Roundtable, Civil Justice Association of California and California Bankers Association as Amici Curiae on behalf of Defendant and Appellant.
    Deborah J. La Fetra and Christina M. Martin for Pacific Legal Foundation as Amicus Curiae on behalf of Defendant and Appellant.
    Altshuler Berzon, Michael Rubin; Jocelyn Larkin, Della Barnett and Michael Caesar for Impact Fund, AARP, Asian Law Caucus, Asian Pacific American Legal Center, Disability Rights Education & Defense Fund, Disability Rights Legal Center, National Consumer Law Center, Public Citizen, Inc., and Public Justice, P.C., as Amici Curiae on behalf of Defendant and Appellant.
    Gibson, Dunn & Crutcher, Theodore J. Boutrous, Jr., Theane Evangelis Kapur and Bradley J. Hamburger for National Association of Security Companies, California Association of Licensed Security Agencies, ABM Security Services Inc., AlliedBarton Security Services, G4S Secure Solutions (USA) Inc., and Securitas Security Services USA, Inc., as Amici Curiae on behalf of Defendant and Appellant.
    Littler Mendelson, Allan G. King, Julie A. Dunne and Margaret Hart Edwards for The Gallup Organization as Amicus Curiae on behalf of Defendant and Appellant.
    
    Paul Hastings, Paul Grossman, Paul W. Cane, Jr., Sean D. Unger and Rishi N. Sharma for California Employment Law Council and Employers Group as Amici Curiae on behalf of Defendant and Appellant.
    Shaw Valenza and D. Gregory Valenza for California Chamber of Commerce as Amicus Curiae on behalf of Defendant and Appellant.
    Law Offices of Ellen Lake, Ellen Lake; Lewis, Feinberg, Lee, Renaker & Jackson, Brad Seligman; Wynne Law Firm, Edward J. Wynne and J.E.B. Pickett for Plaintiffs and Respondents.
    Robbins Geller Rudman & Dowd and Kevin K. Green for Consumer Attorneys of California as Amicus Curiae on behalf of Plaintiffs and Respondents.
    The Kralowec Law Group, Kimberly A. Kralowec; Bryan Schwartz Law, Bryan J. Schwartz; Cohelan Khoury & Singer, Michael D. Singer; Rudy, Exelrod, Zieff & Lowe and Steven G. Zieff for California Employment Lawyers Association as Amicus Curiae on behalf of Plaintiffs and Respondents.
   Opinion

CORRIGAN, J.

We encounter here an exceedingly rare beast: a wage and hour class action that proceeded through trial to verdict. Loan officers for U.S. Bank National Association (USB) sued for unpaid overtime, claiming they had been misclassified as exempt employees under the outside salesperson exemption. (Lab. Code, § 1171.) This exemption applies to employees who spend more than 50 percent of the workday engaged in sales activities outside the office. (Ramirez v. Yosemite Water Co. (1999) 20 Cal.4th 785 [85 Cal.Rptr.2d 844, 978 P.2d 2] (Ramirez).)

After certifying a class of 260 plaintiffs, the trial court devised a plan to determine the extent of USB’s liability to all class members by extrapolating from a random sample. In the first phase of trial, the court heard testimony about the work habits of 21 plaintiffs. USB was not permitted to introduce evidence about the work habits of any plaintiff outside this sample. Nevertheless, based on testimony from the small sample group, the trial court found that the entire class had been misclassified. After the second phase of trial, which focused on testimony from statisticians, the court extrapolated the average amount of overtime reported by the sample group to the class as a whole, resulting in a verdict of approximately $15 million and an average recovery of over $57,000 per person.

As even plaintiffs recognize, this result cannot stand. The judgment must be reversed because the trial court’s flawed implementation of sampling prevented USB from showing that some class members were exempt and entitled to no recovery. A trial plan that relies on statistical sampling must be developed with expert input and must afford the defendant an opportunity to impeach the model or otherwise show its liability is reduced. Statistical sampling may provide an appropriate means of proving liability and damages in some wage and hour class actions. However, as outlined below, the trial court’s particular approach to sampling here was profoundly flawed.

I. BACKGROUND

USB is a nationwide financial services provider. During the relevant period, it operated over 130 branches in California. This class action was brought by USB employees who worked as business banking officers (BBOs). BBOs sell bank products, including loans and lines of credit, to small business customers. Their primary job is to cultivate new business. After a BBO acquires a new client, a client manager handles the portfolio and maintains the relationship. A BBO can be assigned to work with up to four bank branches. Although they typically use one branch office as a home base, some BBOs work from multiple branches or their homes.

A May 1997 job description states that BBOs were expected to develop and manage customer relationships and to “grow[] [USB’s] business through prospecting, networking, cross-selling and relationship management.” Among several other “essential functions,” BBOs were required to “call[] on customers and/or prospects.” They were expected to use a “high degree of creativity and independence in managing account relationships and developing new business.” This job description was essentially unchanged until May 2002, shortly after the complaint here was filed. The new job description splits the list of a BBO’s essential functions into separate categories for “Outside Sales Activity,” “Incidental Activity to Outside Sales,” and “Other Activity,” and specifies that more than 80 percent of a BBO’s time should be spent on “Outside Sales Activity.” During all relevant times, USB has classified the BBO position as exempt from overtime compensation, primarily based on the outside salesperson exemption in Labor Code section 1171.

A. Pretrial Proceedings

On December 26, 2001, a putative class action complaint was filed alleging USB had improperly classified BBOs as exempt, denying them overtime pay in violation of Labor Code section 1194. Class counsel later replaced the original named plaintiff with three new class representatives. In March 2005, when dueling certification motions were pending, counsel replaced these representatives with the two currently named plaintiffs, Samuel (Sam) Duran and Matt Fitzsimmons. All replaced representatives had testified in deposition that they spent more than 50 percent of their workday engaged in sales activities outside USB offices, which would have brought them within the exemption.

1. Initial Class Certification Proceedings

On January 6, 2005, plaintiffs moved to certify the case as a class action. At that time, USB employed approximately 40 BBOs in California. There were over 200 current and former BBOs in the putative class. Plaintiffs provided declarations from 34 current and former BBOs, all averring that they worked overtime hours and spent less than half of their workday engaged in sales-related activities outside their branch office. USB opposed certification. It argued that plaintiffs could not establish a predominance of common issues or that the class action device was superior to other methods of adjudication. USB filed declarations from 83 putative class members, 75 of whom said they typically spent more than 50 percent of their workday engaged in outside sales. USB also submitted deposition testimony from the four former class representatives stating that they regularly worked more than half the day outside the office.

The trial court certified the class. Relying on Sav-On Drug Stores, Inc. v. Superior Court (2004) 34 Cal.4th 319 [17 Cal.Rptr.3d 906, 96 P.3d 194] (Sav-On), it found common questions of law and fact predominated over individual issues based on evidence that: (1) the BBO position was “standardized”; (2) USB classified all BBOs as exempt without examining each employee’s duties or work habits; and (3) USB failed to train or monitor BBOs to ensure that exemption requirements were satisfied. The class was ultimately defined as all California-based BBOs who worked overtime for USB at any time during the period from December 26, 1997, until September 26, 2005.

2. Trial Management Plan

About a year after certification, the parties presented competing trial management plans. USB proposed to divide the class into 20 or 30 groups and have special masters conduct individualized evidentiary hearings on liability and damages. Plaintiffs opposed this idea, arguing that USB had no due process right to assert its affirmative defenses against each individual class member.

As an alternative, plaintiffs proposed the use of surveys and random sampling, as described in a declaration from statistics expert Richard Drogin. First, the parties would identify all tasks performed by BBOs and classify which were sales related. Next, the amount of time class members typically spent on outside activities would be assessed using a classwide survey. The parties’ experts would then jointly design a random sample of surveyed class members to proceed through focused discovery and a phase one trial. Finally, aggregate, classwide damages would be determined at a phase two trial. Once an aggregate damages figure was established, the parties would agree upon a claims procedure to distribute damages to individual class members.

USB strenuously objected to the use of representative sampling. If the court rejected its proposal for focused trials of all class members, USB proposed that the parties each select an equal number of class members for the trial sample. USB argued that a survey would not yield a truly representative sample because class members who were properly classified as exempt would have no interest in participating in trial or returning the survey. Thus, any survey-based sample would be skewed in plaintiffs’ favor.

At a case management conference, the court also expressed concern about the potential for biased survey results and proposed an alternative of its own devising. The court suggested that it could select a random sample of 20 class members to testify at trial. Any findings on liability and damages for this sample would then be extrapolated to the remainder of the class. USB again objected that an attempt to extrapolate liability from representative testimony would violate due process. There was no precedent for using random sampling to establish liability in a class action involving the outside sales exemption. Indeed, neither side was aware of any such case even proceeding to a liability phase trial. The premier case approving the use of representative testimony in an overtime class action, Bell v. Farmers Ins. Exchange (2004) 115 Cal.App.4th 715 [9 Cal.Rptr.3d 544] (Bell), concerned the trial of damages only, not liability. Finally, a declaration from USB’s expert, Phillip Gorman, explained that reliance on a small sample would present a high risk of error.

Notwithstanding these objections, the court decided to proceed with its own plan, taking testimony from 20 randomly selected class members in addition to the two named plaintiffs (hereafter, the representative witness group or RWG). The court directed its clerk to draw names “from the proverbial hat” to select 20 class members plus five alternates. The record does not reflect how the court determined that this number or method of selection was appropriate. The court contemplated trying the case in two phases: Phase one would include testimony from those in the RWG. Phase two would consider evidence, including expert testimony, “seeking to extrapolate the results of Phase I evidence to the class.”

3. Additional Pretrial Proceedings

a. Dismissal of Legal Claims and Opt-out Proceedings

In November 2006, around the same time the trial court finalized the trial management plan and selected the RWG, plaintiffs moved to dismiss their claims under the Labor Code and proceed solely on a claim for equitable relief under the unfair competition law (UCL). (Bus. & Prof. Code, § 17200 et seq.) The court allowed the amendment but also ordered that class members be notified and given a second opportunity to opt out. USB objected that the randomness of the sample would be compromised if members of the RWG withdrew. In total, nine people opted out: Four were members of the RWG, and five were among the remaining 250 people in the class.

USB then asked the court to readmit the four RWG members. It produced declarations from two RWG members stating they had opted out at class counsel’s urging. These declarants believed they had been properly classified as exempt and felt the lawsuit was frivolous. In addition, a declaration from USB’s expert Gorman explained that the much higher opt-out rate for RWG members (20 percent, as compared with 2 percent for the rest of the class) was statistically “very unlikely to be attributable to random chance.” In his opinion, removal of the four RWG members created a biased sample that, if extrapolated, could result in large overestimates of the percentage of misclassified class members and of any overtime pay owed to the class. The court denied the motion, observing that questions about the admissibility of testimony from non-RWG witnesses would be more appropriately addressed at trial. The final class was composed of 260 individuals.
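The opt-out comparison can be checked with a short calculation. The following is a minimal sketch only: the opinion does not say which statistical test Gorman applied, so the one-sided hypergeometric tail used here is an assumption, as are the function and variable names. It asks how likely it would be for at least four of the nine opt-outs to fall among the 20 RWG members if opt-outs were scattered at random across the 270 people eligible to opt out.

```python
from math import comb


def opt_out_tail_probability(sample_size, rest_size, total_opt_outs, sample_opt_outs):
    """Chance that at least `sample_opt_outs` of the opt-outs land in the sample
    when opt-outs are spread at random (hypergeometric upper tail)."""
    population = sample_size + rest_size
    all_outcomes = comb(population, total_opt_outs)
    most_possible = min(sample_size, total_opt_outs)
    favorable = sum(
        comb(sample_size, k) * comb(rest_size, total_opt_outs - k)
        for k in range(sample_opt_outs, most_possible + 1)
    )
    return favorable / all_outcomes


# Figures from the opinion: 4 of 20 RWG members and 5 of the remaining 250 opted out.
rwg_rate = 4 / 20
rest_rate = 5 / 250
p_value = opt_out_tail_probability(
    sample_size=20, rest_size=250, total_opt_outs=9, sample_opt_outs=4
)

print(f"RWG opt-out rate: {rwg_rate:.0%}; rest of class: {rest_rate:.0%}")
print(f"one-sided tail probability under random opt-outs: {p_value:.4f}")
```

Under these assumptions the tail probability is well below one percent, which is consistent with the expert's description of the disparity as very unlikely to be due to chance.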

b. USB’s First Decertification Motion

In March 2007, USB moved to decertify the class action. Citing new case law and deposition testimony from RWG members, it argued that individual issues predominated. USB also submitted a declaration from statistician Andrew Hildreth opining that the RWG sample size was too small to produce a reasonably accurate estimate of classwide liability or damages. Hildreth explained that a high margin of error was inherent in such a small sample size. As a result, it was very likely that a classwide judgment would encompass some employees who were properly classified, and the damages estimate extrapolated from the small sample would be highly inaccurate. The motion was denied.

B. Trial

1. Phase One: Liability

There were many in limine motions, but one was particularly significant. USB sought to introduce declarations and live testimony from class members outside the RWG. These included formerly named plaintiffs, class members who had previously given depositions supporting USB, and individuals who had opted out of the class. The court substantially denied these requests, ruling: “Defendant may not introduce testimony[,] evidence[,] or argument related to BBOs who were not selected in the RWG and/or were not supervising sales managers of the RWG members where the purpose of such testimony or evidence is to impact the data or analyses on the ultimate question of liability or damages. To the extent such witnesses are shown to have admissible percipient witness or impeach[ment] testimony as to RWG members, the testimony may be permitted by [the] Court after determination of objections thereto.” Throughout the trial, the court refused to hear any testimony about the work habits of BBOs not included in the RWG.

Phase one of the bench trial lasted 40 court days. The two named plaintiffs and 19 of the 20 other RWG members testified. USB called several corporate witnesses and the direct supervisors of some of the RWG witnesses.

a. Summary of Evidence Presented

All RWG witnesses worked exclusively on sales. All set their own schedules, deciding when and where they worked. They consistently testified that USB never told them where to work, or that they were required to spend more than half of their work time outside a branch office. USB kept no records of BBOs’ working hours or the proportion of time spent either in or outside bank offices. The RWG witnesses all testified that they generally spent more than half of their workday inside bank offices.

The RWG testimony varied somewhat, however, on the subject of overtime work. Some RWG witnesses testified that they typically worked no more than 40 hours per week. Some testified to relatively small amounts of overtime, reporting workweeks of up to 45 hours. Others reported working more overtime. One RWG member, Chad Penza, was something of an anomaly. At one point during the three and a half years he worked for USB, Penza was the top-producing BBO nationwide. He initially worked 10 and a half hours a day, but after a few months he began working 12 to 13 hours a day, including several hours on weekends. Penza explained that he chose to work long hours because he was trying to reach sales goals and succeed. Other RWG witnesses similarly testified that their work schedules and habits were motivated by the desire to meet sales goals and not by any expectation from the bank that they work overtime.

Some of the RWG members were impeached with contrary declarations they had previously signed. For example, top producer Penza executed two declarations, in 2002 and 2004, stating that he spent from 75 to 100 percent of his time making outside sales calls. Another BBO, Adney Koga, signed a declaration and testified in deposition that he typically spent 55 percent of his time away from the office on sales calls.

USB called several witnesses. Ted Biggs, the western regional manager for USB’s small business group, testified that BBOs were expected to spend the majority of their time making outside sales calls, networking, and visiting customers’ businesses. Biggs encouraged all his BBOs to meet sales goals using a “15-3-1-1” model. According to this model, a BBO who makes an average of 15 customer contacts a week should obtain three loan applications, which will normally yield one loan approval and one funded loan. Given the length of the average sales call, Biggs estimated that BBOs should spend up to 30 hours a week meeting with potential customers at the clients’ businesses. Other USB managers testified that they routinely counseled BBOs to spend a majority of their time meeting with customers outside bank locations.

Managers also described their supervision of those in the RWG. District manager Michael Lewis supervised RWG member Matthew Gediman for over a year. He estimated that Gediman spent from 55 to 70 percent of his time outside the office during the first six months. Later, when Gediman’s production waned, Lewis encouraged him to spend more time on outside sales. Similarly, sales manager Pat Collins testified that she hired class representative Sam Duran and told him he was expected to spend the majority of his time on outside sales calls. When Duran failed to meet sales goals, she encouraged him to follow the 15-3-1-1 model and make more outside calls.

b. Posttrial Motions

At the close of evidence in phase one, USB filed a due process motion seeking to introduce deposition excerpts and over 70 declarations from class members outside the RWG. The court barred this evidence as inconsistent with its selected trial plan. The court also denied USB’s motion for judgment under Code of Civil Procedure section 631.8.

In anticipation of phase two, plaintiffs moved to amend the declaration of their expert, Jon Krosnick, to permit trial testimony about the results of a telephone survey Krosnick had conducted of class members’ work hours. The court allowed the amendment, and USB moved to exclude the survey evidence. In opposition, plaintiffs filed a declaration from their statistics expert, Richard Drogin. Drogin opined that phase one findings of liability and average weekly hours of unpaid overtime could be “reliably projected to the whole class” because they were based on a random sample. Taking the court’s indicated findings for phase one, with adjustments for vacation time and other breaks in service, Drogin calculated a weighted average of overtime for the RWG at 11.87 hours per week, with a margin of error of plus or minus 5.14 hours at a 95 percent confidence interval. The relative margin of error for the overtime estimate was plus or minus 43.3 percent. Although this margin of error based on the RWG sample alone was quite large, Drogin observed that Krosnick’s survey had actually produced a higher estimate of weekly overtime (14.39 hours per week). Drogin asserted that Krosnick’s study corroborated the accuracy of the RWG estimate. Nevertheless, the court ruled Krosnick’s survey evidence inadmissible unless it became relevant for impeachment.

Shortly before the formal statement of decision was issued in phase one, USB moved again to decertify the class. USB argued that because trial evidence revealed wide variations among class members, individual issues predominated as to both liability and restitution. The motion was denied. The court decided to extend liability findings for RWG members to the class as a whole. It dismissed as premature USB’s objection to the calculation of restitution by extrapolation from phase one evidence.

c. Statement of Decision

The trial court issued a phase one statement of decision on September 22, 2008, approximately a year after the close of evidence. It found that, during all relevant times, USB did not have a policy requiring BBOs to spend more than half their time away from bank locations. Although some defense witnesses testified the bank expected BBOs to spend most of their time away from the office, the court discredited this testimony based on its assessment of the evidence and the lack of documentary support. The court found that BBOs were never told they were expected to spend time outside the bank. USB did not track the time BBOs worked inside or outside of bank offices. Consistent with their classification as exempt employees, the bank kept no record of BBOs’ work hours. USB had no compliance program to ensure BBOs were properly classified, and no BBO had ever been disciplined for spending excessive time inside the bank. The trial court also found that it was unrealistic for USB to expect BBOs to work more than half their time outside the bank because many BBO job duties could only be performed, or could most easily be performed, inside bank offices.

The court also found that the practice of working more than half the time inside the bank did not diverge from USB’s realistic expectations for work performance because USB had no expectations concerning where the work would be performed. “[T]he only expectation [USB] had for its BBOs was that they hit their production goals.” BBOs were evaluated, ranked, compensated, rewarded, and disciplined based solely on their sales production. Where BBOs worked, or even how they performed their job, did not matter to USB. The bank cared only whether BBOs were generating and retaining business.

Accordingly, the court concluded USB did not carry its burden of proof on the outside salesperson exemption. Based primarily on testimony from RWG witnesses, the court ruled that the entire class of BBOs employed by USB between December 26, 1997, and September 26, 2005, was misclassified as exempt, and all class members were owed overtime in amounts to be determined in phase two of the trial. The court provisionally accepted Drogin’s assertion that RWG members worked an average of 11.87 hours of overtime per week but deferred consideration of the number’s significance to phase two.

2. Phase Two: Restitution

a. Pretrial Motions

Before the start of evidence in phase two, plaintiffs filed an in limine motion to prevent USB from introducing any evidence pertaining to liability because that question had been resolved in the court’s statement of decision for phase one. The court granted the motion, noting that the purpose of phase one had been to resolve USB’s liability for misclassification. The court once again denied USB’s request to introduce declarations and deposition testimony from non-RWG class members. The court thus barred any challenge to its phase one decision that all class members were misclassified as exempt and all were entitled to overtime compensation.

b. Evidence

Plaintiffs’ statistics expert Richard Drogin testified that the trial court’s methodology in phase one was statistically sound. Drogin conceded, however, that the plan differed from his own proposal. Drogin had suggested that the entire class be surveyed as to how much time each member spent on outside sales. No such survey was ever conducted. Nor was the sample group entirely random. It included the two named plaintiffs, who had been chosen by class counsel to replace four apparently less satisfactory representatives. Nevertheless, Drogin believed their inclusion did not skew the sample in favor of the class because his calculations produced a higher average weekly overtime number when the named plaintiffs were excluded. Drogin testified that the trial court’s phase one findings on liability could be extrapolated to the class with a 13 percent margin of error. In other words, based on the court’s finding that all RWG witnesses were misclassified, Drogin determined that at least 87 percent of the class was misclassified. Under Drogin’s own calculations, then, based on extrapolation from a small and not entirely random sample, up to 13 percent of the class was properly classified as exempt. This conclusion stood in contrast to the trial court’s determination that the entire class was misclassified.

As to restitution, Drogin testified that RWG members reported working an average of 11.87 hours of overtime per week. He arrived at this figure by adding the number of overtime hours the court found had been worked by the 21 testifying witnesses and dividing that total by the number of weeks they had worked. If a witness reported a range of overtime hours, Drogin picked the midpoint. He calculated that the margin of error for this figure was plus or minus 5.14 hours per week, or 43.3 percent.
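The arithmetic described in this testimony can be sketched as follows. This is an illustration under stated assumptions, not a reconstruction of Drogin's actual work: the per-witness findings below are hypothetical placeholders (the opinion does not list them), each finding is assumed to be expressed as weekly overtime hours over a number of weeks worked, and the function names are invented for the example. Only the 11.87 and 5.14 figures come from the opinion.

```python
def weighted_average_overtime(findings):
    """Average weekly overtime across witnesses, weighted by weeks worked.

    `findings` is a list of (low_hours, high_hours, weeks_worked) tuples.
    A reported range is replaced by its midpoint, as the testimony describes.
    """
    total_hours = sum(((low + high) / 2) * weeks for low, high, weeks in findings)
    total_weeks = sum(weeks for _, _, weeks in findings)
    return total_hours / total_weeks


def relative_margin(estimate, margin):
    """Margin of error expressed as a fraction of the estimate."""
    return margin / estimate


# Hypothetical witnesses (NOT from the record): no overtime for 100 weeks,
# 5 hours a week for 100 weeks, and a 10-to-14 hour range for 50 weeks.
hypothetical_findings = [(0, 0, 100), (5, 5, 100), (10, 14, 50)]
average = weighted_average_overtime(hypothetical_findings)

# Figures reported in the opinion.
estimate, margin = 11.87, 5.14
relative = relative_margin(estimate, margin)
interval = (round(estimate - margin, 2), round(estimate + margin, 2))

print(f"hypothetical weighted average: {average:.2f} hours per week")
print(f"relative margin of error: {relative:.1%}")
print(f"implied range: {interval[0]} to {interval[1]} hours per week")
```

On the reported figures, the implied range runs from roughly 6.7 to 17 overtime hours per week, which shows how wide the uncertainty around the 11.87-hour estimate was.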

USB’s statistics expert, Andrew Hildreth, identified several problems with the trial court’s sampling plan. He explained that simply drawing a random sample is not sufficient to produce an unbiased and accurate estimate about an underlying population. To be reliable, the sample must be sufficiently large and free from bias caused by various sampling errors. Here, the sample size was too small. Hildreth explained that, before a sample is selected, a pilot study is typically done to determine the amount of variation in the underlying population. Based on this pilot study, experts can estimate the standard deviation in the population and then, using the desired margin of error, calculate the optimal sample size. Although both sides’ experts had proposed such a study, none was done before the court decided to pick 20 class members for the sample.
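The sample-size step Hildreth describes can be illustrated with a standard textbook formula. This is a sketch under assumptions, not the method either expert proposed: it uses the normal-approximation formula for estimating a mean, with a finite population correction, and the 12-hour standard deviation is a purely hypothetical stand-in for what a pilot study might have shown. The function name and the z-value of 1.96 (for 95 percent confidence) are likewise choices made for the example.

```python
import math


def required_sample_size(std_dev, margin, population, z=1.96):
    """Sample size needed to estimate a mean within +/- `margin`.

    Uses n0 = (z * sigma / E)^2, then applies a finite population
    correction because the class is small relative to n0.
    """
    n0 = (z * std_dev / margin) ** 2
    corrected = n0 / (1 + (n0 - 1) / population)
    return math.ceil(corrected)


# Hypothetical pilot-study result: overtime varies with a standard deviation
# of 12 hours per week across a class of 260.
tight = required_sample_size(std_dev=12, margin=1, population=260)
loose = required_sample_size(std_dev=12, margin=5, population=260)

print(f"sample needed for +/- 1 hour per week: {tight}")
print(f"sample needed for +/- 5 hours per week: {loose}")
```

The point of the sketch is the dependence Hildreth identified: the required sample grows quickly as the population becomes more variable or the tolerated margin of error shrinks, so the sample size cannot sensibly be fixed before the variation is estimated.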

Hildreth explained that a “non-response” error occurred when a member of the RWG, Borsay Bryant, failed to appear and testify at trial. The court responded to Bryant’s absence by eliminating him from the sample and treating him like an absent class member. However, when members of a sample group leave or refuse to participate for reasons relevant to the matter under consideration, the sample participants who choose to remain may not accurately reflect the underlying population.

In Hildreth’s opinion, the sample was also seriously marred by selection bias. When the court allowed a second round of opt-outs after random selection of the RWG, it effectively created two groups with different motivations. If RWG members opted out, they would no longer have to testify at trial, and their testimony would no longer influence the sample results. However, if class members outside the RWG opted out, this choice would not change their participation in the trial nor affect the sample results. The two groups did, in fact, behave differently: Less than 2 percent of the nonsample group opted out, whereas 20 percent of the sample group did so. Hildreth explained that the second round of opt-outs gave members of the RWG an opportunity to self-select into the sample group, compromising the randomness of the sample. In Hildreth’s opinion, the court compounded the selection bias problem when it refused to hear testimony from plaintiffs who had opted out of the RWG. Inclusion of the two named plaintiffs in the sample group also created selection bias. These plaintiffs were selected by class counsel, not a random draw. This meant approximately 10 percent of the already small sample group had been selected by interested parties. Including nonrandom plaintiffs in the sample group had an obvious potential to bias the results.

Hildreth also opined that there was no statistical basis to conclude from the court’s phase one findings that 100 percent of the class was misclassified. Even if all the sampling errors he identified could be ignored, and all those in the random sample were correctly found to be misclassified, it was still statistically possible that 13 percent of the class was properly classified as exempt. For a sample of 19, Hildreth calculated that up to 14 percent of the class, or 36 members, could have been properly classified.
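One common way to see why a sample in which every witness was found misclassified still leaves room for exempt class members is the following bound. It is a sketch only: the opinion does not state how either expert computed the 13 or 14 percent figures, and the approach here (a one-sided 95 percent bound that treats each sampled witness as an independent draw) is an assumption that happens to land close to them. If a share p of the class were properly classified, the chance that n random draws would all turn up misclassified members is (1 - p)^n; setting that chance to 5 percent and solving gives p = 1 - 0.05^(1/n).

```python
def upper_bound_exempt_share(sample_size, confidence=0.95):
    """Largest exempt share still consistent, at the given confidence,
    with a random sample in which no one was found to be exempt."""
    return 1 - (1 - confidence) ** (1 / sample_size)


CLASS_SIZE = 260  # class size stated in the opinion

for n in (21, 19):
    bound = upper_bound_exempt_share(n)
    print(
        f"sample of {n}: up to {bound:.1%} of the class, "
        f"about {bound * CLASS_SIZE:.0f} members, could be exempt"
    )
```

The bound tightens only slowly as the sample grows, which illustrates why a finding about roughly 20 witnesses cannot by itself establish that all 260 class members were misclassified.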

c. Statement of Decision

C. Posttrial Proceedings

In a motion for new trial, USB argued it was denied due process by the court’s refusal to admit non-RWG class member declarations or depositions and its refusal to hear non-RWG testimony offered in USB’s defense. The motion was denied.

On appeal, the judgment was unanimously reversed. The Court of Appeal held the trial plan’s reliance on representative sampling to determine liability denied USB its due process right to litigate affirmative defenses. Due process concerns were also implicated by the high margin of error. Finally, the court concluded the trial court had abused its discretion in denying USB’s second motion to decertify the class. Even if certification had once appeared appropriate, it should have been apparent after phase one that individual issues predominated to such an extent that they rendered class treatment impossible. In addition to reversing the trial court’s judgment, the Court of Appeal ordered the class decertified. We granted review.

II. DISCUSSION

During the past decade, California courts have seen an increasing number of class action lawsuits alleging workers were wrongly classified as exempt from overtime laws and other labor regulations. Employers often treat all workers within a job position as either exempt or nonexempt. In actuality, however, Labor Code exemptions frequently depend on how individual employees perform their jobs. When an exemption defense turns on such individualized issues, questions about how, or whether, the case can proceed as a class action become particularly thorny.

Faced with the potential difficulties of managing individual issues in misclassification cases, many trial courts have denied certification or decertified the class before trial. (See post, at pp. 29-31.) Under deferential appellate review for abuse of discretion (see Linder v. Thrifty Oil Co. (2000) 23 Cal.4th 429, 435-436 [97 Cal.Rptr.2d 179, 2 P.3d 27]), such decisions have been routinely upheld. Conversely, other trial courts have granted certification in misclassification actions, and these decisions, too, have been upheld. (See, e.g., Sav-On, supra, 34 Cal.4th 319; Bell, supra, 115 Cal.App.4th 715; see also post, at p. 31 & fn. 28.) As far as we are aware, however, this is only the second misclassification case in California certified as a class action and tried to verdict.

This appeal highlights difficult questions about how individual issues can be successfully managed in a complex class action. After reviewing the requirements of the outside salesperson exemption, we discuss the trial court’s obligation to consider the manageability of individual issues in certifying a class action. In particular, we hold that a class action trial management plan must permit the litigation of relevant affirmative defenses, even when these defenses turn on individual questions. Next, we explain how the trial court ignored individual issues here, hamstringing USB’s ability to defend itself. Finally, we describe the flaws in the trial plan’s implementation of statistical sampling as proof of USB’s liability to the class.

A. The Outside Salesperson Exemption

USB’s primary defense was that plaintiffs were exempt from overtime laws because they were outside salespeople. Labor Code section 1171 provides that the overtime pay requirements of Labor Code section 1194 apply to those “employed in any occupation, trade, or industry, whether compensation is measured by time, piece, or otherwise, but shall not include any individual employed as an outside salesman . . . .” (Italics added.) The applicable wage order also states that its provisions “shall not apply to outside salespersons.” (Industrial Welfare Com., wage order No. 4-2001 (hereafter Wage Order No. 4-2001); Cal. Code Regs., tit. 8, § 11040, subd. 1(C).) An “ ‘[o]utside salesperson’ ” is one “who customarily and regularly works more than half the working time away from the employer’s place of business selling tangible or intangible items or obtaining orders or contracts for products, services or use of facilities.” (Wage Order No. 4-2001, subd. 2(M).) The employer bears the burden of proving that the outside salesperson exemption applies. (Ramirez, supra, 20 Cal.4th at pp. 794-795.)

Ramirez also involved the outside salesperson exemption. (Ramirez, supra, 20 Cal.4th 785.) Unlike the corresponding federal provision, California’s wage order definition “takes a purely quantitative approach” and focuses exclusively on whether the employee spends more than half of the workday engaged in sales activities outside the office. (Id. at p. 797.) The exemption requires scrutiny of both the job description and an employee’s own work habits. (Id. at pp. 801-802.) The trial court must inquire “first and foremost, how the employee actually spends his or her time.” (Id. at p. 802, italics added.) Ancillary questions include “whether the employee’s practice diverges from the employer’s realistic expectations, whether there was any concrete expression of employer displeasure over an employee’s substandard performance, and whether these expressions were themselves realistic given the actual overall requirements of the job.” (Ibid.)

The Ramirez dispute centered on whether the plaintiff spent more than half his working time engaged in sales. (Ramirez, supra, 20 Cal.4th at pp. 802-803; see Walsh v. IKON Office Solutions, Inc., supra, 148 Cal.App.4th at pp. 1445-1446.) Here, the parties agree that BBOs spent most or all of their workday in that fashion. The dispute concerns where they typically did the work. As the courts below recognized, the wage order’s approach to this question is just as quantitative as it was in Ramirez. For the exemption to apply, a BBO must “customarily and regularly work[] more than half the working time away from the employer’s place of business selling ... or obtaining orders or contracts . . . .” (Wage Order No. 4-2001, subd. 2(M), italics added.)

We have observed that some common questions about the exemption “are likely to prove susceptible of common proof” in a class action. (Sav-On, supra, 34 Cal.4th at p. 337.) Job requirements and employer expectations of how duties are to be performed may often be established by evidence relating to a group as a whole. (Ramirez, supra, 20 Cal.4th at p. 802.) But litigation of the outside salesperson exemption has the obvious potential to generate individual issues because the primary considerations are how and where the employee actually spends his or her workday. (Sav-On, at pp. 336-337; Ramirez, at p. 802.) Of course, the questions of actual performance and employer expectations can be intertwined. For example, evidence that most members of a company’s sales force actually spend the majority of their time working in the office might be relevant to show that the employer’s expectations regarding outside sales work were unreasonable. Yet, as noted, the question is “first and foremost” how the employee’s time is actually spent. (Ramirez, at p. 802.) Given California’s uniquely quantitative approach to this exemption (see id. at p. 801), some proof about how individual employees use their time will often be necessary to accurately determine an employer’s overtime liability.

B. Certification of Misclassification Class Actions

Although putative class actions alleging misclassification are increasingly common, these cases are only rarely tried to verdict. Settlement should never be treated as a foregone conclusion, however. In the misclassification context, as in other types of cases, trial courts deciding whether to certify a class must consider not just whether common questions exist, but also whether it will be feasible to try the case as a class action. Depending on the nature of the claimed exemption and the facts of a particular case, a misclassification claim has the potential to raise numerous individual questions that may be difficult, or even impossible, to litigate on a classwide basis. Class certification is appropriate only if these individual questions can be managed with an appropriate trial plan.

1. Class Certification Principles

a. Predominance of Common Issues

A class action may be maintained if there is “an ascertainable class and a well-defined community of interest among the class members.” (Washington Mutual Bank v. Superior Court (2001) 24 Cal.4th 906, 913 [103 Cal.Rptr.2d 320, 15 P.3d 1071] (Washington Mutual); see Code Civ. Proc., § 382.) As part of the community of interest requirement, the party seeking certification must show that issues of law or fact common to the class predominate. (Richmond v. Dart Industries, Inc. (1981) 29 Cal.3d 462, 470 [174 Cal.Rptr. 515, 629 P.2d 23].)

We have observed that the “ultimate question” for predominance is whether “the issues which may be jointly tried, when compared with those requiring separate adjudication, are so numerous or substantial that the maintenance of a class action would be advantageous to the judicial process and to the litigants.” (Collins v. Rocha (1972) 7 Cal.3d 232, 238 [102 Cal.Rptr. 1, 497 P.2d 225]; see Lockheed Martin Corp. v. Superior Court (2003) 29 Cal.4th 1096, 1104-1105, 1108 [131 Cal.Rptr.2d 1, 63 P.3d 913].) “The answer hinges on ‘whether the theory of recovery advanced by the proponents of certification is, as an analytical matter, likely to prove amenable to class treatment.’ (Sav-On, [supra, 34 Cal.4th] at p. 327.) . . . ‘As a general rule if the defendant’s liability can be determined by facts common to all members of the class, a class will be certified even if the members must individually prove their damages.’ [Citations.]” (Brinker Restaurant Corp. v. Superior Court (2012) 53 Cal.4th 1004, 1021-1022 [139 Cal.Rptr.3d 315, 273 P.3d 513] (Brinker); see Employment Development Dept. v. Superior Court (1981) 30 Cal.3d 256, 266 [178 Cal.Rptr. 612, 636 P.2d 575]; Vasquez v. Superior Court (1971) 4 Cal.3d 800, 809, 815 [94 Cal.Rptr. 796, 484 P.2d 964].) However, we have cautioned that class treatment is not appropriate “if every member of the alleged class would be required to litigate numerous and substantial questions determining his individual right to recover following the ‘class judgment’ ” on common issues. (City of San Jose v. Superior Court (1974) 12 Cal.3d 447, 459 [115 Cal.Rptr. 797, 525 P.2d 701].)

The granting of class certification thus requires a determination that group, rather than individual, issues predominate. Such a finding, however, does not preclude the consideration of individual issues at trial when those issues legitimately touch upon relevant aspects of the case being litigated.

b. Manageability of Individual Issues

Although predominance of common issues is often a major factor in a certification analysis, it is not the only consideration. In certifying a class action, the court must also conclude that litigation of individual issues, including those arising from affirmative defenses, can be managed fairly and efficiently. (Washington Mutual, supra, 24 Cal.4th at pp. 922-923.) “[W]hether in a given case affirmative defenses should lead a court to approve or reject certification will hinge on the manageability of any individual issues. [Citation.]” (Brinker, supra, 53 Cal.4th at p. 1054 (conc. opn. of Werdegar, J.).) In wage and hour cases where a party seeks class certification based on allegations that the employer consistently imposed a uniform policy or de facto practice on class members, the party must still demonstrate that the illegal effects of this conduct can be proven efficiently and manageably within a class setting. (Brinker, at p. 1033; Dailey v. Sears, Roebuck & Co. (2013) 214 Cal.App.4th 974, 989 [154 Cal.Rptr.3d 480].)

After a class has been certified, the court’s obligation to manage individual issues does not disappear. “[O]nce the issues common to the class have been tried, and assuming some individual issues remain, each plaintiff must still by some means prove up his or her claim, allowing the defendant an opportunity to contest each individual claim on any ground not resolved in the trial of common issues.” (Johnson v. Ford Motor Co. (2005) 35 Cal.4th 1191, 1210 [29 Cal.Rptr.3d 401, 113 P.3d 82].) In Sav-On, supra, 34 Cal.4th at page 332, we upheld the certification of an overtime class action even though the defendant complained that calculation of each class member’s recovery would likely “ ‘degenerate into a multitude of mini-trials.’ ” There, we found substantial evidence of common issues based on class members’ allegations that they were all required to work overtime and perform nonexempt tasks pursuant to uniform company policies and practices. (Id. at pp. 327-328.) We upheld the trial court’s certification order even as we acknowledged that individualized proof of class members’ nonexempt status and overtime amounts might ultimately be required. (Id. at pp. 332-334.) In so doing, we stressed that “[i]ndividual issues do not render class certification inappropriate so long as such issues may effectively be managed.” (Id. at p. 334, italics added.)

Trial courts must pay careful attention to manageability when deciding whether to certify a class action. In considering whether a class action is a superior device for resolving a controversy, the manageability of individual issues is just as important as the existence of common questions uniting the proposed class. If the court makes a reasoned, informed decision about manageability at the certification stage, the litigants can plan accordingly and the court will have less need to intervene later to control the proceedings.

Trial courts also have the obligation to decertify a class action if individual issues prove unmanageable. (Sav-On, supra, 34 Cal.4th at p. 335; Washington Mutual, supra, 24 Cal.4th at p. 927.) In the context of overtime class actions, some courts have decertified when individual issues related to an exemption defense threaten to overwhelm the litigation. For example, in Walsh v. IKON Office Solutions, Inc., supra, 148 Cal.App.4th at pages 1445-1448, the court certified an overtime class action involving the outside salesperson exemption. It later decertified the class when discovery revealed that the circumstances of each class member’s employment differed significantly. The Court of Appeal affirmed this ruling, noting that differences in time spent on sales activities and work outside the office meant that adjudication of the exemption would require individual hearings on liability and damages. (Id. at p. 1456.) Similarly, Keller v. Tuesday Morning, Inc. (2009) 179 Cal.App.4th 1389 [102 Cal.Rptr.3d 498] upheld the decertification of an overtime class action brought by retail store managers. Although the trial court initially certified a class based on our opinion in Sav-On, two years later it determined that individual inquiries concerning how managers spent their time would overwhelm the issues susceptible to classwide proof. (Keller, at p. 1399.)

2. Management of Individual Issues in Misclassification Class Actions

Employers in misclassification cases typically argue their exemption defense raises issues unique to each individual class member. As a result, misclassification class actions can pose difficult manageability challenges.

In her concurring opinion in Brinker, Justice Werdegar drew an instructive distinction between the types of affirmative defenses that can undermine manageability: “For purposes of class action manageability, a defense that hinges liability vel non on consideration of numerous intricately detailed factual questions, as is sometimes the case in misclassification suits, is different from a defense that raises only one or a few questions and that operates not to extinguish the defendant’s liability but only to diminish the amount of a given plaintiff’s recovery.” (Brinker, supra, 53 Cal.4th at p. 1054 (conc. opn. of Werdegar, J.), fn. omitted.) Defenses that raise individual questions about the calculation of damages generally do not defeat certification. (Sav-On, supra, 34 Cal.4th at p. 334.) However, a defense in which liability itself is predicated on factual questions specific to individual claimants poses a much greater challenge to manageability. This distinction is important. As we observed in City of San Jose v. Superior Court, supra, 12 Cal.3d at page 463: “Only in an extraordinary situation would a class action be justified where, subsequent to the class judgment, the members would be required to individually prove not only damages but also liability.”

Unless an employer’s uniform policy or consistent practice violates wage and hour laws (see, e.g., Brinker, supra, 53 Cal.4th at p. 1033), California courts have been reluctant to certify class actions alleging misclassification. (E.g., Arenas v. El Torito Restaurants, Inc. (2010) 183 Cal.App.4th 723, 734 [108 Cal.Rptr.3d 15]; Dunbar v. Albertson’s, Inc., supra, 141 Cal.App.4th 1422, 1431; see Soderstedt v. CBIZ Southern California, LLC (2011) 197 Cal.App.4th 133, 153-154 [127 Cal.Rptr.3d 394] [certification denied, despite employer’s uniform policies, due to variations in how the policies were implemented with different employees].)

However, individual issues will not necessarily overwhelm common issues when a case involves exemptions premised on how employees spend the workday. In Sav-On, supra, 34 Cal.4th 319, for example, we upheld certification of an overtime class action based on a showing that all plaintiffs performed jobs that were highly standardized. As a result, class members performed essentially the same tasks, most of which were nonexempt as a matter of law. (Id. at pp. 327-328.) Further, the defendant’s corporate policy required all class members to work overtime. (Id. at p. 327.) Where standardized job duties or other policies result in employees uniformly spending most of their time on nonexempt work, class treatment may be appropriate even if the case involves an exemption that typically entails fact-specific individual inquiries.

Moreover, if sufficient common questions exist to support class certification, it may be possible to manage individual issues through the use of surveys and statistical sampling. Statistical methods cannot entirely substitute for common proof, however. There must be some glue that binds class members together apart from statistical evidence. While sampling may furnish indications of an employer’s centralized practices (see Sav-On, supra, 34 Cal.4th at p. 333), no court has “deemed a mere proposal for statistical sampling to be an adequate evidentiary substitute for demonstrating the requisite commonality, or suggested that statistical sampling may be used to manufacture predominate common issues where the factual record indicates none exist” (Dailey v. Sears, Roebuck & Co., supra, 214 Cal.App.4th at p. 998). In addition, as we will discuss, a statistical plan for managing individual issues must be conducted with sufficient rigor.

If statistical evidence will comprise part of the proof on class action claims, the court should consider at the certification stage whether a trial plan has been developed to address its use. A trial plan describing the statistical proof a party anticipates will weigh in favor of granting class certification if it shows how individual issues can be managed at trial. Rather than accepting assurances that a statistical plan will eventually be developed, trial courts would be well advised to obtain such a plan before deciding to certify a class action. In any event, decertification must be ordered whenever a trial plan proves unworkable.

3. Trial Plan Did Not Manage Individual Issues Arising from USB’s Exemption Defense

Here, the trial court found a predominance of common questions based on (1) standardization of the BBO position, (2) USB’s classification of all BBOs as exempt, without inquiry into their work habits, and (3) USB’s failure to train or monitor BBOs to ensure compliance with the exemption. The primary consideration in a misclassification case pertains to “the realistic requirements of the job.” (Ramirez, supra, 20 Cal.4th at p. 802.) The trial court ultimately made detailed findings to the effect that the BBO position was essentially a telemarketing job, most easily performed in the office. However, at the certification stage, it should have been apparent that litigation of the outside salesperson defense would also involve significant inquiry into how each of the class’s 260 members “actually spen[t] his or her time.” (Ibid.)

Evidence presented in connection with the certification motions showed that BBOs enjoyed exceptional independence. The bank did not tell them when, where, or how to do their work. It focused exclusively on whether BBOs achieved their sales targets. Consistent with this independence, declarations offered by both sides showed significant variation in the time individual BBOs worked outside the office. The trial court was of course entitled to discredit USB’s declarations. However, it received no evidence establishing uniformity in how BBOs spent their time. (See Ramirez, supra, 20 Cal.4th at p. 802.) Wide variation among class members is a factor informing whether the exemption question can be resolved by a simple “yes” or “no” answer for the entire class.

Thus, USB’s exemption defense raised a host of individual issues. While common issues among class members may have been sufficient to satisfy the predominance prong for certification, the trial court also had to determine that these individual issues could be effectively managed in the ensuing litigation. (See Brinker, supra, 53 Cal.4th at p. 1054 (conc. opn. of Werdegar, J.); Sav-On, supra, 34 Cal.4th at p. 334.) Here, the certification order was necessarily provisional in that it was subject to development of a trial plan that would manage the individual issues surrounding the outside salesperson exemption.

In general, when a trial plan incorporates representative testimony and random sampling, a preliminary assessment should be done to determine the level of variability in the class. (See post, at p. 42.) If the variability is too great, individual issues are more likely to swamp common ones and render the class action unmanageable. No such assessment was done here. With no sensitivity to variability in the class, the court forced the case through trial with a flawed statistical plan that did not manage but instead ignored individual issues.

This result cannot stand. Although courts enjoy great latitude in structuring trials, and we have encouraged the use of innovative procedures, any trial must allow for the litigation of affirmative defenses, even in a class action case where the defense touches upon individual issues. As we will explain, the trial plan here unreasonably prevented USB from supporting its affirmative defense. Accordingly, the class judgment must be reversed. The trial court is of course free to entertain a new certification motion on remand, but if it decides to proceed with a class action it must apply the guidelines set out here.

C. Class Action Trial Must Permit Adjudication of Affirmative Defenses

We have encouraged trial courts to be “procedurally innovative” in managing class actions. (City of San Jose v. Superior Court, supra, 12 Cal.3d at p. 453.) We have remained open to the appropriate use of representative testimony, sampling, or other procedures employing statistical methodology. (See Sav-On, supra, 34 Cal.4th at pp. 339-340.) However, the trial plan here was seriously flawed. First, without following a valid statistical model developed by experts, the court improperly extrapolated liability findings from a small, skewed sample group to the entire class. Second, in pursuing this extrapolation, the court adamantly refused to admit relevant evidence relating to BBOs outside the sample group. These rulings significantly impaired USB’s ability to present a defense.

Although the trial court’s certification decision was apparently influenced by Sav-On, supra, 34 Cal.4th 319, the court overlooked our advisements about the need to manage individual issues in a class action. Although we found substantial evidence of common issues supporting certification in that misclassification case, we also articulated an important caveat: “Unquestionably, . . . defendant is entitled to defend against plaintiffs’ complaint by attempting to demonstrate wide variations in the types of stores and, consequently, in the types of activities and amounts of time per workweek the [class members] in those stores spent on different types of activities.” (Id. at pp. 329-330.) In rigidly adhering to its flawed trial plan and excluding relevant evidence central to the defense, the court here did not manage individual issues. It ignored them.

We have long observed that the class action procedural device may not be used to abridge a party’s substantive rights. “Class actions are provided only as a means to enforce substantive law. Altering the substantive law to accommodate procedure would be to confuse the means with the ends — to sacrifice the goal for the going.” (City of San Jose v. Superior Court, supra, 12 Cal.3d at p. 462.) While class action defendants may not have an unfettered right to present individualized evidence in support of a defense, our precedents make clear that a class action trial management plan may not foreclose the litigation of relevant affirmative defenses, even when these defenses turn on individual questions.

For example, in Granberry v. Islay Investments (1995) 9 Cal.4th 738, 742-743 [38 Cal.Rptr.2d 650, 889 P.2d 970], we held that a landlord sued for excess rent by a class of former tenants was entitled to set off amounts the tenants owed for unpaid rent, repairs, and cleaning. (Id. at p. 743.) The tenants objected that “to allow setoff would be inappropriate in class actions . . . because of numerous practical difficulties,” such as the need for setoff amounts to be litigated individually in thousands of cases. (Id. at p. 749.) In rejecting the plaintiffs’ argument that these difficulties should bar the landlord from raising the setoff defense, we stressed that “it is inappropriate to deprive defendants of their substantive rights merely because those rights are inconvenient in light of the litigation posture plaintiffs have chosen.” (Ibid.)

We voiced similar concerns in Washington Mutual, supra, 24 Cal.4th 906. There, the trial court had certified a nationwide class action challenging a bank’s practice of forcing homebuyers to purchase expensive replacement policies when the hazard insurance on their properties lapsed. (Id. at p. 912.) The bank argued common questions did not predominate because choice-of-law provisions in the loan documents would require the application of different state laws to different plaintiffs’ claims. (Id. at p. 913.) We held that, when deciding whether to certify a nationwide class, trial courts must consider the potential for individual issues arising from choice-of-law clauses. (Id. at p. 926.) Although the Court of Appeal had suggested businesses should not be allowed to rely on choice-of-law clauses to escape involvement in a nationwide class action, we disagreed. (Id. at p. 918.) Instead, stressing that class action procedure must conform to substantive law, not vice versa, we explained that “an otherwise enforceable choice-of-law agreement may not be disregarded merely because it may hinder the prosecution of a multistate or nationwide class action . . . .” (Ibid.)

Similarly here, the trial court could not abridge USB’s presentation of an exemption defense simply because that defense was cumbersome to litigate in a class action. Under Code of Civil Procedure section 382, just as under the federal rules, “a class cannot be certified on the premise that [the defendant] will not be entitled to litigate its statutory defenses to individual claims.” (Wal-Mart Stores, Inc. v. Dukes (2011) 564 U.S. ___, ___ [180 L.Ed.2d 374, 131 S.Ct. 2541, 2561].) These principles derive from both class action rules and principles of due process. (See Lindsey v. Normet (1972) 405 U.S. 56, 66 [31 L.Ed.2d 36, 92 S.Ct. 862]; Philip Morris USA v. Williams (2007) 549 U.S. 346, 353 [166 L.Ed.2d 940, 127 S.Ct. 1057].)

The court’s decision to extrapolate classwide liability from a small sample, and its refusal to permit any inquiries or evidence about the work habits of BBOs outside the sample group, deprived USB of the ability to litigate its exemption defense. USB repeatedly submitted sworn declarations from 75 class members stating that they worked more than half their time outside the office. This evidence suggested that work habits among BBOs were not uniform and that nearly one-third of the class may have been properly classified as exempt and lacking any valid claim against USB. USB also sought to introduce live testimony from witnesses about their work outside the office as BBOs. Yet the court refused to admit any of this evidence or allow it to be considered by experts as part of a statistical sampling model. Instead, extrapolating findings from its small sample and ignoring all evidence proffered to impeach these findings, the court found that the entire class was misclassified. The injustice of this result is manifest. While representative testimony and sampling may sometimes be appropriate tools for managing individual issues in a class action, these statistical methods cannot so completely undermine a defendant’s right to present relevant evidence.

To defend the trial plan, plaintiffs analogize to Teamsters v. United States (1977) 431 U.S. 324 [52 L.Ed.2d 396, 97 S.Ct. 1843] (Teamsters) and argue wage and hour defendants have no right to litigate an exemption defense as to each class member “during the liability phase” of trial. The analogy does not hold. Teamsters was a title VII disparate treatment case. (Teamsters, at p. 328.) The ultimate question was not whether the employer had treated any individual unfairly, but instead whether the employer had engaged in a pattern and practice of disparate treatment of racial and ethnic minorities. (Id. at p. 335.) The Supreme Court outlined a two-step process for such litigation. First, the plaintiff must make a prima facie case that the employer has a regular policy or practice of unlawful discrimination. (Id. at p. 360.) If it does so, an inference of discrimination as to all class members arises. The burden then shifts to the employer to defeat the inference by showing that any individual employment decision was made for legitimate nondiscriminatory reasons. (Id. at pp. 360-362.) If such reasons are offered, the plaintiff may prevail by showing the proffered reasons were merely pretextual. (Id. at p. 362, fn. 50; McDonnell Douglas Corp. v. Green (1973) 411 U.S. 792, 804-806 [93 S.Ct. 1817, 36 L.Ed.2d 668].)

Plaintiffs appear to urge a similar approach in misclassification cases, with an initial “liability” phase devoted to classwide evidence of misclassification and a second “remedial” phase addressing the extent of damages or other relief to be provided to the class. In the first phase, plaintiffs assert, “[i]t would be inconsistent with the requirement of common evidence” for the employer to be permitted to litigate its exemption defense against individual class members. Disputes over individual plaintiffs’ “entitlement to relief” would have to await the second phase of proceedings, to be considered along with arguments regarding the amount of damages to be paid.

At the outset, plaintiffs’ argument rests on a false assumption. Class actions do not create a “requirement of common evidence.” Instead, class litigation may be appropriate if the circumstances of a particular case demonstrate that there is common evidence. In any event, the attempt to apply a Teamsters approach here fails.

First, the issues involved, and the means for deciding them, are very different. In a disparate treatment case, the ultimate question is whether the employer followed a policy or practice of unlawful discrimination. (Teamsters, supra, 431 U.S. at p. 335.) The focus of the different phases is on proving the employer’s discriminatory intent. In a misclassification case, whether a given employee is properly classified depends in large part on the employee’s individual circumstances, not the employer’s intent. In a pattern and practice case, the employer’s actions must be examined in the aggregate to determine whether the employer is liable to any particular plaintiff for discrimination. (See Nagareda, Class Certification in the Age of Aggregate Proof (2009) 84 N.Y.U. L.Rev. 97, 117, 151.) Statistical proof can be useful, even necessary, to illuminate the defendant’s wrongdoing. The same is true in securities fraud class actions and mass tort cases based on epidemiological evidence. (Id. at pp. 117-118.) This rationale for aggregate proof simply has no application in wage and hour litigation alleging misclassification. Although common proof may be possible if there are uniform job requirements or policies, an employer’s liability for misclassification under most Labor Code exemptions will depend on employees’ individual circumstances. Liability to one employee is in no way excused or established by the employer’s classification of other employees.

Moreover, the scope of injuries caused by the defendant’s conduct differs in the two types of cases. “In a discrimination case, it is a reasonable possibility that all class members, even those who have fared exceedingly well, were subject to this unlawful policy or practice. . . . [¶] This dimension is absent in a misclassification case. That is, it is meaningless to suggest that an employee who is performing exempt duties would have been even more exempt had the employer not engaged in the standard practice of misclassification. Further, because the employer already bears the burden of establishing that each employee qualifies as exempt, nothing is accomplished procedurally by litigating the pattern and practice claim. The employer still must prove, class member by class member, that the particular employee qualified as exempt.” (King & Muraco, Classwide Determinations of Overtime Exemptions: The False Dichotomy Posed by Sav-On and a Suggested Solution (2006) 21 Lab. Law. 257, 268-269 (hereafter King & Muraco).)

Second, plaintiffs’ vision of a wage and hour class action trial conflates liability and damages. All phases of the Teamsters model are geared toward proving the defendant’s liability for discrimination, which is generally suffered by all group members regardless of their individual circumstances. Decisions about the amount of damages owed for discriminatory conduct are entirely separate; they are not addressed in the burden-shifting framework at all. In plaintiffs’ view of wage and hour class actions, disputes about the defendant’s liability to any particular employee must be decided in correlation with decisions about the amount of damages owed to the class. In other words, decisions about the fact of liability are reframed as questions about the extent of liability. However, as the trial below demonstrated, once the court has decided against the defendant in phase one, it is all too easy to presume the defendant’s liability to all class members in phase two.

This is not to say that an employer’s liability for misclassification may never be decided on a classwide basis. A class action trial may determine that an employer is liable to an entire class for misclassification if it is shown that the employer had a consistently applied policy or uniform job requirements and expectations contrary to a Labor Code exemption, or if it knowingly encouraged a uniform de facto practice inconsistent with the exemption. (See, e.g., Bell, supra, 115 Cal.App.4th at p. 743.) In such a case, the evidence for uniformity among class members would be strong, and common proof would be sufficient to call for the employer to defend its claimed exemption. However, any procedure to determine the defendant’s liability to the class must still permit the defendant to introduce its own evidence, both to challenge the plaintiffs’ showing and to reduce overall damages. No case, to our knowledge, holds that a defendant has a due process right to litigate an affirmative defense as to each individual class member. However, if liability is to be established on a classwide basis, defendants must have an opportunity to present proof of their affirmative defenses within whatever method the court and the parties fashion to try these issues. If trial proceeds with a statistical model of proof, a defendant accused of misclassification must be given a chance to impeach that model or otherwise show that its liability is reduced because some plaintiffs were properly classified as exempt.

D. Sampling Techniques Must Satisfy Realistic Guidelines to Minimize the Risk of Error

Questions about the use of statistical evidence to prove classwide liability and damages are far from settled. The court below adopted a trial plan that sought to prove liability and damages by extrapolation from the testimony of a small sample group. In addition to the undue restrictions this plan placed on USB’s ability to support its exemption defense, the court’s attempt to implement random sampling was beset by numerous problems.

Sampling is a methodology based on inferential statistics and probability theory. “The essence of the science of inferential statistics is that one may confidently draw inferences about the whole from a representative sample of the whole.” (In re Chevron U.S.A., Inc. (5th Cir. 1997) 109 F.3d 1016, 1019-1020.) Whether such inferences are supportable, however, depends on how representative the sample is. “[I]nferences from the part to the whole are justified [only] when the sample is representative.” (Kaye & Freedman, Reference Guide on Statistics in Reference Manual on Scientific Evidence (3d ed. 2011) pp. 211, 216-217.) Several considerations determine whether a sample is sufficiently representative to fairly support inferences about the underlying population.

1. Use of Sampling to Prove Misclassification Liability

It is an open question, hotly contested among the parties and amici curiae, whether statistical sampling can legitimately be used to prove a defendant’s liability to absent class members. The question has arisen in numerous contexts, ranging from mass torts (e.g., Cimino v. Raymark Industries, Inc. (5th Cir. 1998) 151 F.3d 297, 319-320) to employment discrimination (e.g., Wal-Mart Stores, Inc. v. Dukes, supra, 564 U.S. at pp. - [131 S.Ct. at pp. 2560-2561]). In the wage and hour context, recent decisions from federal district courts have disagreed about whether statistical sampling may be used to prove liability. (Compare Dilts v. Penske Logistics, LLC (S.D.Cal. 2010) 267 F.R.D. 625, 638 [approving the potential use of statistical sampling to prove classwide damages and liability for off-the-clock claims] with In re Wells Fargo Home Mortgage Overtime Pay Litigation (N.D.Cal. 2010) 268 F.R.D. 604, 612 [rejecting sampling proposal to prove misclassification liability due to the difficulty of extrapolating mixed results] & Weigele v. FedEx Ground Package System, Inc. (S.D.Cal. 2010) 267 F.R.D. 614, 623-625 [rejecting sampling proposal due to the potentially high margin of error and lack of rational guidelines for extrapolating results to the class].)

One published California case describes the successful use of statistical sampling in the trial of a wage and hour class action. In Bell, supra, 115 Cal.App.4th 715, insurance claims representatives sued their employer for unpaid overtime, recovering a class judgment of over $90 million. The court below relied heavily on Bell in developing the trial plan here. But, in doing so, it failed to note one critical distinction. The statistical evidence in Bell was heard only after classwide liability had been established. At the summary adjudication stage, the court ruled that the administrative exemption did not apply to any plaintiff; therefore, all had been misclassified as exempt. (Id. at pp. 720-721.) The employer unsuccessfully challenged that ruling on appeal. (Bell v. Farmer’s Ins. Exchange (2001) 87 Cal.App.4th 805 [105 Cal.Rptr.2d 59], review den. June 20, 2001, cert. den. sub nom. Farmer’s Ins. Exchange v. Bell (2001) 534 U.S. 1041 [151 L.Ed.2d 539, 122 S.Ct. 616].) On remand, sampling was used to prove damages only. (See Bell, supra, 115 Cal.App.4th at pp. 721-722.) Because all class members were nonexempt, the only unresolved question was how many overtime hours they worked. After several months and an unusual degree of cooperation, the parties’ experts developed a statistical sampling plan to answer the question. (Id. at pp. 722-723.)

The issues here were far more complex, encompassing both liability and damages. When the court announced its sampling plan, liability remained disputed, the parties disagreed on a sampling approach, and, as we explain, the court-devised method was statistically flawed.

We need not reach a sweeping conclusion as to whether or when sampling should be available as a tool for proving liability in a class action. It suffices to note that any class action trial plan, including those involving statistical methods of proof, must allow the defendant to litigate its affirmative defenses. If a defense depends upon questions individual to each class member, the statistical model must be designed to accommodate these case-specific deviations. If statistical methods are ultimately incompatible with the nature of the plaintiffs’ claims or the defendant’s defenses, resort to statistical proof may not be appropriate. Procedural innovation must conform to the substantive rights of the parties.

2. Use of Sampling to Prove Misclassification Damages

The use of statistical sampling to prove damages in overtime class actions is less controversial, largely due to the relaxed standard of proof established by the Supreme Court in Mt. Clemens, supra, 328 U.S. 680. In addition, broadly speaking, the law tolerates more uncertainty with respect to damages than to the existence of liability. “Uncertainty of the fact whether any damages were sustained is fatal to recovery, but uncertainty as to the amount is not.” (Bruckman v. Parliament Escrow Corp. (1987) 190 Cal.App.3d 1051, 1061 [235 Cal.Rptr. 813].)

Mt. Clemens was a suit for unpaid overtime under the Fair Labor Standards Act of 1938 (FLSA) (29 U.S.C. § 216(b)). As with California’s analogous labor laws, the plaintiff in an FLSA action has the burden of proving he or she was not properly compensated for work performed. (Mt. Clemens, supra, 328 U.S. at pp. 686-687.) Such proof is almost impossible to establish, however, if the employer has not kept accurate or adequate records. Recognizing this problem, the Supreme Court held that when an employer’s records are inaccurate or incomplete, the employee carries this burden by proving the amount and extent of work performed “as a matter of just and reasonable inference.” (Id. at p. 687.) “The burden then shifts to the employer to come forward with evidence of the precise amount of work performed or with evidence to negative the reasonableness of the inference to be drawn from the employee’s evidence. If the employer fails to produce such evidence, the court may then award damages to the employee, even though the result be only approximate.” (Id. at pp. 687-688.) Under this burden-shifting framework, an employer is not allowed to benefit from its own poor recordkeeping. (Id. at p. 687.)

Some federal cases have relied on Mt. Clemens in using random sampling to prove the reasonable, if approximate, amount of compensation owed to the class. (See Morgan v. Family Dollar Stores, Inc. (11th Cir. 2008) 551 F.3d 1233, 1277-1279.) The Mt. Clemens rationale is not sufficient to support the use of sampling to prove an employer’s liability, however. The high court stressed that employees must carry the initial burden of proving they have performed work that was not compensated in accordance with applicable labor laws. (Mt. Clemens, supra, 328 U.S. at p. 688.) Once injury is established, “[t]he uncertainty lies only in the amount of damages arising from the statutory violation by the employer.” (Ibid.)

Moreover, any compensation awarded to the class must be based solely on overtime hours worked by nonexempt employees. Overtime hours worked by exempt employees are irrelevant. If a sampling plan used to calculate damages cannot distinguish exempt from nonexempt employees, it may be difficult to obtain an accurate estimate of overtime owed to the class. (See King & Muraco, supra, 21 Lab. Law. at pp. 265-266.)

3. Sampling Errors in the Trial Below

Even when statistical methods such as sampling are appropriate, due concern for the parties’ rights requires that they be employed with caution. Here, the process failed.

a. Sample Size Was Too Small

The first and most obvious flaw in the trial court’s sampling plan was the size of the sample group. A sample must be sufficiently large to provide reliable information about the larger group. “How many cases need to be sampled? This depends in large part on the variability of the population. The more diverse the population, the larger the sample must be in order to reflect the population accurately. The more homogeneous the population, the fewer cases that need to be sampled. . . . [¶] In general, the larger the sample, the more likely it will reflect the population; the smaller the sample, the less likely it is to do so-for any given degree of heterogeneity.” (Saks & Blanck, Justice Improved: The Unrecognized Benefits of Aggregation and Sampling in the Trial of Mass Torts (1992) 44 Stan. L.Rev. 815, 842, fn. omitted (hereafter Saks & Blanck).) It is impossible to determine an appropriate sample size without first learning about the variability in the population. (Lahav, The Case for “Trial by Formula” (2012) 90 Tex. L.Rev. 571, 630 (hereafter Lahav).) One way to assess population variability is through the use of surveys. (Id. at p. 631.)
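
The relationship between population variability and sample size can be illustrated with the standard textbook formula for estimating a mean, n = (z × σ / E)², where σ is the population standard deviation and E is the desired margin of error. The sketch below is illustrative only: the standard deviations are hypothetical, the function name is ours, and nothing here is drawn from the record in this case or in Bell.

```python
import math

def required_sample_size(std_dev, margin, z=1.96):
    """Smallest n with z * std_dev / sqrt(n) <= margin (normal approximation)."""
    return math.ceil((z * std_dev / margin) ** 2)

# Hypothetical populations: weekly overtime that varies a little versus a lot.
# Both aim for a one-hour margin of error at 95 percent confidence.
homogeneous = required_sample_size(std_dev=5.0, margin=1.0)
diverse = required_sample_size(std_dev=10.0, margin=1.0)
print(homogeneous, diverse)
```

Under these assumed figures, doubling the variability roughly quadruples the sample needed for the same margin of error, which is why sample size cannot sensibly be fixed before variability is studied.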

For reasons that are not clear, the court here chose a sample size of 20, plus the two named plaintiffs. It did so without input from either side’s statistical experts. It purported to rely on Bell, supra, 115 Cal.App.4th 715. Yet in Bell both sides’ experts had worked for months to develop a mutually acceptable sampling plan. (Id. at pp. 722-723.) After “ ‘an initial pilot sample’ ” of 50 depositions, the experts agreed that results with a one-hour-per-week margin of error could be achieved using a sample size of 286 plaintiffs. (Id. at p. 722.) The parties ultimately deposed 295 class members and brought the margin of error down to less than an hour a week. (Id. at p. 723.) The close involvement of the parties’ experts in determining sample size led to this low margin of error and the “virtually identical calculations” of total overtime owed to the class. (Id. at p. 724.)

The proceedings here were strikingly different. Neither side’s experts had studied the variability among class members on key questions including percentage of hours worked outside or average weekly overtime. The court chose a sample size that would be convenient and manageable. The same could be said of a sample of one. Yet convenience alone cannot justify procedures that substantially curtail the parties’ ability to litigate their case. If sampling is used to estimate the extent of a party’s liability, care must be taken to ensure that the methodology produces reliable results. With input from the parties’ experts, the court must determine that a chosen sample size is statistically appropriate and capable of producing valid results within a reasonable margin of error.

b. Sample Was Not Random

A sample must be randomly selected for its results to be fairly extrapolated to the entire class. Although the trial court initially drew names for the RWG at random, several of its later rulings compromised random selection. “A ‘random sample’ is one in which each member of the population has an equal probability of being selected for inclusion in the sample.” (Saks & Blanck, supra, 44 Stan. L.Rev. at p. 821, fn. 48.) Even when selection procedures appear to be random, errors may arise that undermine randomness. For instance, nonresponse bias can occur if a sample is chosen randomly from a group containing only survey respondents. The potential for bias arises because those who do not respond have no probability of inclusion in the sample. Thus, although the participants are randomly selected from among respondents, the sample will not reflect the characteristics of members of the population who chose not to respond to the survey. (Phillips et al., What’s Good in Theory May Be Flawed in Practice: Potential Legal Consequences of Poor Implementation of a Theoretical Sample (2012) 9 Hastings Bus. L.J. 77, 90-91.)

Selection bias occurs when members of the population are chosen based on a nonrandom criterion or are selectively included or excluded from the sample group. In litigation, selection bias can occur when members of the population are allowed to opt out of the class. If plaintiffs with high-value claims opt out, the sample will be skewed toward low-value claims and may result in an unfairly low estimate of damages. (Lahav, supra, 90 Tex. L.Rev. at p. 622.) Conversely, if the opt-outs represent mainly low-value claims or plaintiffs with no valid claim, the sample results will be unfairly inflated. Self-interest may motivate class members to act in ways that will maximize the class award. Thus, “[o]ne must always suspect that any nonrandom method of picking sample cases will be skewed and therefore will be an inaccurate estimate of the population average.” (Id. at p. 624.)

Selection bias can also occur if named plaintiffs are included in the sample based not on random selection but on their status in the litigation. Certainly class counsel are entitled to select named plaintiffs in a manner that enhances their position. But that tactical choice should not compromise the statistical approach required for random sampling.

A sample that includes even a small number of interested parties can produce biased results. The impact of this error is magnified when the biased results are extrapolated to the entire population. Selection bias cannot be cured simply by increasing the size of the sample. “When a selection procedure is biased, taking a large sample does not help. This just repeats the basic mistake on a larger scale.” (Freedman et al., Statistics (4th ed. 2007) p. 335 (hereafter Freedman).)
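
The point that a larger sample does not cure a biased selection procedure can be seen in a small simulation. This is a sketch under stated assumptions, not a model of the RWG: the population of overtime hours is invented, and the bias (members below a cutoff never enter the sample) is a hypothetical stand-in for any nonrandom exclusion.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of average weekly overtime hours.
population = [rng.gauss(10.0, 4.0) for _ in range(10_000)]
true_mean = statistics.mean(population)

def random_sample_mean(n):
    """Every member has an equal chance of selection."""
    return statistics.mean(rng.sample(population, n))

def biased_sample_mean(n, cutoff=6.0):
    """Assumed bias: members below the cutoff never enter the sample."""
    eligible = [hours for hours in population if hours >= cutoff]
    return statistics.mean(rng.sample(eligible, n))

# Error of each estimate relative to the true mean, at growing sample sizes.
for n in (20, 200, 2000):
    print(n,
          round(random_sample_mean(n) - true_mean, 2),
          round(biased_sample_mean(n) - true_mean, 2))
```

As the sample grows, the random sample's error shrinks toward zero, while the biased sample settles on a consistently inflated figure.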

Finally, a sample that is representative of a population when first drawn may become less so over time. (Saks & Blanck, supra, 44 Stan. L.Rev. at p. 841.) In class action litigation, such changes can occur with opt-outs or other events that change the class composition. “Attention must be paid to possible changes that could render a previously representative sample unrepresentative. When that occurs, sampling will not accurately reflect what needs to be known about the population.” (Ibid.)

Numerous rulings here undermined randomness and gave class counsel the ability to influence the cases selected to be tried in the sample group.

First, the trial court excluded one person selected for the RWG, Brian Smith, because his work activities differed from those of other BBOs and the court did not consider him to be a “true” BBO. However, Smith’s work activities may not have been unique. As a randomly selected member of the class, Smith may in fact have been representative of a sizeable, if unknown, group of BBOs whose job duties differed from the norm. The exclusion of perceived outliers “ ‘is more often associated with conscious, or perhaps unconscious, attempts to make a particular process perform as one would like it to perform rather than accepting the actual performance.’ ” (Freedman, supra, at p. 103.) By characterizing Smith as atypical and excluding his testimony, the court imposed a greater uniformity on the RWG than its “random” sampling methodology warranted.

Similarly, at the trial stage, one member of the RWG was able to exclude his own case from the sample group simply by failing to appear. Borsay Bryant did not respond to a trial subpoena and gave no testimony that could be included in the sample results. Even so, the court permitted him to recover under the class judgment. If plaintiffs are allowed to opt out of a sample group yet recover from the resulting class judgment, a plaintiff with a weaker case will be motivated to exclude himself and benefit from the higher recovery produced by the stronger cases in the sample. Bryant may not have acted with this subjective motivation, but his failure to appear was yet another instance of selection bias that compromised the randomness of the sample.

Next, after the court drew names for the RWG, it granted plaintiffs’ motion to amend the complaint and ordered the parties to give all class members, including those selected for the RWG, a second opportunity to opt out of the proceedings. Nothing prevented class counsel from assessing the strength of cases selected for the RWG and encouraging plaintiffs with weak liability or damages claims to opt out. In fact, two class members who had given favorable deposition testimony for USB submitted declarations stating that class counsel strongly encouraged them to opt out after they were selected for the RWG. Ultimately, a far greater percentage of opt-outs came from the 20 people selected by the court than from the more than 200 other class members. Four of the nine plaintiffs who opted out were in the RWG. Thus, one in five of the 20 RWG members opted out whereas fewer than one in 50 class members from outside the RWG did so. Despite a declaration from USB’s expert explaining that this disparity in opt-out rates could not be attributed to chance and created a hopelessly biased sample, the court refused to restore the RWG opt-outs to the sample group or, even better, draw a new random sample from the entire class.
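
One conventional way to ask whether such a disparity could plausibly arise by chance is a hypergeometric tail probability: if the nine opt-outs had been spread across the class at random, how often would four or more fall within the 20 RWG members? The sketch below is our own illustration, not the analysis USB's expert performed. The class size of 220 is an assumption (20 RWG members plus the "more than 200" others); a larger class would make the result smaller still.

```python
from math import comb

def prob_at_least(k, class_size, sample_size, opt_outs):
    """P(at least k opt-outs fall in the sample) if opt-outs were spread
    across the class purely by chance (hypergeometric upper tail)."""
    outside = class_size - sample_size
    total = comb(class_size, opt_outs)
    upper = min(sample_size, opt_outs)
    return sum(
        comb(sample_size, i) * comb(outside, opt_outs - i)
        for i in range(k, upper + 1)
    ) / total

# From the opinion: 20 RWG members, 9 opt-outs, 4 of them from the RWG.
# Assumption: a class of 220 members in total.
p = prob_at_least(4, class_size=220, sample_size=20, opt_outs=9)
print(f"{p:.4f}")
```

On these assumptions the probability is well under one percent.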

Finally, the court decided to base its extrapolation not just on those drawn randomly, but also on testimony from the two named plaintiffs, Duran and Fitzsimmons. These plaintiffs were selected, not randomly, but by class counsel. Indeed, class counsel substituted the lead plaintiffs repeatedly. The original named plaintiff, Amina Rafiqzada, was replaced early in the litigation and two months later testified in deposition that she spent more than half her workday outside the office. The next three class representatives (Vanessa Haven, Abby Karavani, and Parham Shekarlab) were replaced after they testified that they spent from 60 to 90 percent of the workday outside the office. Plaintiffs’ expert conceded Duran and Fitzsimmons were not technically part of the random sample and, as a result, found it impossible to say whether they were representative of the class. The trial court, too, acknowledged that data derived from the named plaintiffs’ testimony would not normally be included in a statistical analysis of a random sample, but it excused this deviation by fiat, declaring that the court was “the final arbiter of what is representative of the class.” Trial courts have broad discretion in many areas. But they cannot exercise that discretion in ways contrary to the internal rules of a scientific specialty, such as statistics, and then rely on that specialty’s established reliability as if the rules had been followed. The trial court also justified inclusion of the named plaintiffs by noting that the overtime estimate for the class was higher without their testimony. However, this change would have increased the margin of error to 47 percent, making the sample results even less reliable.

Based on testimony from the 19 class members who testified and the two named plaintiffs, the court ruled that the entire class of BBOs had been misclassified. In rigid adherence to its sampling plan, the court refused to hear potentially contrary testimony from 10 plaintiffs who would otherwise have been included in the sample: the four previously named plaintiffs, the four RWG members who opted out, the RWG member whose work habits were different, and the RWG member who failed to appear for trial. The resulting sample was not random, but appeared to be biased in plaintiffs’ favor.

c. Intolerably Large Margin of Error Resulted

“Margin of error” is a statistical measurement of the reliability of an estimate produced by sampling. It reflects the amount by which the estimate may be wrong given a certain confidence interval. Plaintiffs’ expert Drogin defined margin of error as the “plus or minus” amount associated with the estimate at a given level of confidence. Statisticians typically calculate margin of error using a 95 percent confidence interval, which is the interval of values above and below the estimate within which one can be 95 percent certain of capturing the “true” result. The margin of error is simply one-half of the confidence interval; it is the amount by which the sampling estimate may be either too high or too low. (See Bell, supra, 115 Cal.App.4th at p. 753.)
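
For an estimated mean, the margin of error described above is commonly computed as z × s / √n, where s is the sample standard deviation, n is the sample size, and z is about 1.96 for 95 percent confidence. The sketch below uses that normal approximation, which is an assumption on our part; with very small samples statisticians often use a larger multiplier. The witness figures are hypothetical.

```python
import math
import statistics

def margin_of_error(sample, z=1.96):
    """Half-width of an approximate 95 percent confidence interval for the mean."""
    return z * statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical weekly overtime figures from five witnesses.
hours = [8, 10, 12, 14, 16]
estimate = statistics.mean(hours)
moe = margin_of_error(hours)

# The confidence interval spans the estimate plus or minus the margin of error.
interval = (estimate - moe, estimate + moe)
print(estimate, round(moe, 2), interval)
```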

Here, testimony from the skewed sample group produced an estimate of 11.86 hours of average weekly overtime. Drogin calculated a margin of error of 5.14 hours, or 43.3 percent, at a 95 percent confidence interval. In other words, he could be 95 percent confident that, on average, RWG members worked as little as 6.72 hours and as much as 17 hours of overtime each week. This means that the actual average overtime worked by class members could have been 43.3 percent higher (17) or lower (6.72) than the 11.86-hour estimate. Manifestly, the range of potential compensation for plaintiffs in the sample was substantial. The wide range of error means that a judgment based on the estimate could reflect an amount nearly double the true extent of USB’s liability. As even plaintiffs acknowledge, this margin of error was intolerably high.
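
The arithmetic behind these figures can be checked directly. The two inputs below come from the opinion; the variable names are ours.

```python
estimate = 11.86   # estimated average weekly overtime hours (from the opinion)
margin = 5.14      # margin of error in hours at 95 percent confidence

# Bounds of the interval and the margin expressed as a share of the estimate.
low, high = estimate - margin, estimate + margin
relative_margin = margin / estimate

print(f"interval: {low:.2f} to {high:.2f} hours")
print(f"relative margin of error: {relative_margin:.1%}")
print(f"estimate / low end: {estimate / low:.2f}")
```

If the true average sat at the low end of the interval, the 11.86-hour estimate would overstate it by a factor of about 1.76.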

In Bell, supra, 115 Cal.App.4th at pages 756 to 757, the same Court of Appeal that decided this case concluded a 32 percent margin of error was so large that the resulting damages award violated due process. The court went on to observe, in dicta, that “[t]he reliability of an estimate subject to a large margin of error might conceivably be bolstered by evidence of a high response rate, probable distribution within the margin of error, absence of measurement error, or other matters.” (Id. at p. 756.) The trial court here cited Bell’s “bolstering” factors to justify reliance on Drogin’s overtime estimate despite the 43.3 percent margin of error. Specifically, the court asserted that reliability of the estimate was enhanced by: (1) the random selection of the sample; (2) the high response rate of RWG members who testified at trial; (3) the lack of measurement error, in that the sample results accurately reflected the court’s findings; (4) the anecdotal evidence and survey data corroborating the estimate; (5) the absence of outliers in the sample; (6) the design of the trial plan; and (7) the court’s previous consideration of alternative procedures.

We question whether such ancillary indicia of reliability could ever be sufficient to justify reliance on an estimate with a margin of error approaching 50 percent. In any case, the trial court’s assessment was at odds with the facts. (1) Contrary to the court’s findings, the randomness of the sample group was repeatedly compromised, and the results obtained for the RWG were marred by selection bias. (2) The response rate for the sample group cannot reasonably be considered high given that one RWG member did not appear for trial, four opted out when given the opportunity, and one was removed by the court. (3) The sample’s results almost certainly included a degree of measurement error. Witnesses’ recollections of their past overtime were bound to be imprecise. Moreover, several RWG members testified to a range of overtime hours without specifying where in the range they typically worked. In such cases, Drogin chose to rely on the midpoint of the range, but this measure would have been inaccurate if the witness typically worked at the high or low end. (4) The court’s assertion that the sample results were corroborated by survey data amounts to little more than bootstrapping. When plaintiffs’ expert, Jon Krosnick, surveyed class members after phase one of the trial, the court ruled the survey data inadmissible as a violation of its trial plan and refused to allow USB to conduct any discovery into the design or execution of the survey. Having excluded the survey data and denied USB an opportunity to impeach it, the court could not fairly rely on those data to bolster the reliability of the sample results. (5) As explained ante at page 44, the exclusion of one BBO whom the court considered an “outlier” could well have undermined, rather than enhanced, reliability. As to factors (6) and (7), it is unclear how the court’s trial plan or consideration of other procedures could be said, as a matter of statistical rigor, to have made the estimate more reliable.

Nor can the damages estimate here be salvaged by favorable statements in Bell, supra, 115 Cal.App.4th 715 about the utility of sampling to prove classwide damages. The sampling plan in Bell was developed by experts with a significant degree of cooperation among the parties, and the court specifically held that it offered an appropriate means of proving damages under the Mt. Clemens “ ‘just and reasonable inference’ ” standard. (Bell, at p. 758; see Mt. Clemens, supra, 328 U.S. at p. 687.) Moreover, Bell involved a random sample of nearly 300 and a margin of error under 10 percent. (Bell, at pp. 753-755.)

Plaintiffs respond to these points with two arguments. First, they contend USB waived any challenge to the 43.3 percent margin of error because the bank refused to agree to alternative procedures suggested by the trial court to reduce the margin of error. After the first phase of trial, the court said it would entertain argument on three alternatives for phase two: (1) proceed with expert testimony alone based on the phase one findings; (2) require non-RWG class members to prove up their overtime amounts in some form of streamlined procedure; or (3) consider additional survey evidence bearing on the restitution calculation. The court stressed that these alternatives were limited to the issue of restitution and could not be used to reexamine the phase one finding that all class members had been misclassified. USB objected to this limitation, noting that the issues of liability and restitution were linked because any plaintiff properly classified as exempt would not be entitled to restitution. Instead, USB urged the court to conduct focused trials for each class member to determine both liability and restitution. The court rejected this proposal and proceeded with option (1), the course favored by plaintiffs.

USB did not waive its right to seek relief on appeal. Forfeiture of issues on appeal typically occurs when a party fails to object. (9 Witkin, Cal. Procedure (5th ed. 2008) Appeal, § 396, p. 453.) The related doctrines of waiver and invited error prevent a party from taking advantage of an error that could have been corrected earlier if brought to the trial court’s attention. (In re Marriage of Arceneaux (1990) 51 Cal.3d 1130, 1133-1134 [275 Cal.Rptr. 797, 800 P.2d 1227].) USB consistently and vigorously objected to the sampling plan, and acted within its rights in doing so. Litigants must of course obey court orders, but they are not obligated to agree to unfair procedures simply because they are convenient for the court. To conclude there was a waiver here would stretch the concept beyond recognition.

Second, although they now concede the 43.3 percent margin of error is too high to support the judgment, plaintiffs stress that this margin of error did not infect the court’s phase one finding of classwide liability. Drogin’s estimate that 100 percent of the class was misclassified was subject to a margin of error of only 13 percent. However, even if plaintiffs’ expert calculated a lower margin of error for the liability estimate, the sampling plan that produced this estimate was tainted by selection bias and the other problems noted. In light of these sampling errors, the true margin of error may have been considerably higher. In any event, the trial court ignored the margin of error entirely when it ruled that USB was liable to all class members, even though a number of class members admitted facts establishing they had been properly classified. A lower margin of error on the liability estimate does not make up for the high margin of error accompanying the restitution award. More fundamentally, the liability estimate cannot be trusted because it resulted from an unfair trial.

d. Sampling Errors Require Reversal

Plaintiffs urge us to ignore the serious flaws in the sampling plan because USB’s expert Hildreth raised these criticisms in his testimony and the trial court rejected them, finding the testimony of plaintiffs’ experts more credible and persuasive. Although we generally defer to such credibility determinations, the trial court’s decision to jettison all statistically grounded criticism of the sampling plan the court itself created was not supported by substantial evidence. Here, the trial court invented its own sampling methodology, without input from the parties’ experts. It then adamantly adhered to this methodology, rejecting substantial expert criticism.

We need not resolve here whether statistical sampling can ever be used in a misclassification action to prove an employer’s liability to absent class members. Assuming that sampling may be an appropriate means of proving liability or damages in a wage and hour class action, the sample relied upon must be representative and the results obtained must be sufficiently reliable to satisfy concerns of fundamental fairness. These conditions were not satisfied here.

E. Conclusion

While class certification can be a useful tool for deciding common questions, we have repeatedly explained that remaining individual issues must be fairly managed. In some cases, statistical methods may offer a reasonable and appropriate way to do so. (See, e.g., Bell, supra, 115 Cal.App.4th 715.) However, reliance on statistical proof cannot be used to bar the presentation of valid defenses to either liability or damages, even if the alternative would require adjudication of a defense on an individual level. When liability is to be established on a classwide basis, the defendant must have an opportunity to present proof of affirmative defenses within whatever method the court and parties fashion to try these issues. If the trial proceeds with a statistical model of proof, a defendant accused of misclassification must be given a chance to impeach that model or otherwise show that its liability is reduced because some plaintiffs were properly classified as exempt.

We review class action trial management decisions for abuse of discretion. (Fireside Bank v. Superior Court (2007) 40 Cal.4th 1069, 1087 [56 Cal.Rptr.3d 861, 155 P.3d 268].) The trial court’s exclusion of all evidence about the work habits of BBOs outside the sample group and its implementation of a biased sampling plan were manifestly an abuse of the court’s discretion. The errors require reversal if it is reasonably probable that they affected the verdict. (College Hospital Inc. v. Superior Court (1994) 8 Cal.4th 704, 715 [34 Cal.Rptr.2d 898, 882 P.2d 894]; see People v. Watson (1956) 46 Cal.2d 818, 836 [299 P.2d 243].) “We have made clear that a ‘probability’ in this context does not mean more likely than not, but merely a reasonable chance, more than an abstract possibility. [Citations.]” (College Hospital, at p. 715.) Here, there can be little question that the trial court’s findings on liability and damages would have been different absent its erroneous exclusion of evidence and reliance on faulty statistical methodology. Accordingly, both aspects of the class judgment must be reversed.

III. DISPOSITION

The Court of Appeal’s judgment is affirmed in its entirety. On remand, a new trial will be required for both liability and restitution, and the trial court may entertain a new class certification motion.

Cantil-Sakauye, C. J., Baxter, J., Werdegar, J., Chin, J., Liu, J., and Kennard, J., concurred.

LIU, J.,

Concurring. — It is not difficult to understand why the trial court’s sampling plan in this case was “profoundly flawed.” (Maj. opn., ante, at p. 13.) The representative witness group was not selected at random but in a manner biased in plaintiffs’ favor. The trial court used no known statistical rationale in picking a sample size of 20, and there is no reason to think the sample was sufficiently large. The trial court also tolerated a margin of error at the damages phase that was undoubtedly too large. These errors require reversal of both the liability phase and restitution phase judgments.

At the same time, today’s opinion takes an appropriately cautious approach to guiding the conduct of class action trials in employee misclassification cases and, in particular, the use of statistical methods in such trials. The court disavows any “sweeping conclusion as to whether or when sampling should be available as a tool for proving liability in a class action,” while emphasizing that any trial plan involving statistical methods “must allow the defendant to litigate its affirmative defenses.” (Maj. opn., ante, at p. 40; see id. at p. 35 [“While representative testimony and sampling may sometimes be appropriate tools for managing individual issues in a class action, these statistical methods cannot so completely undermine a defendant’s right to present relevant evidence.”].) The court warns that “decisions about the fact of liability” should not be “reframed as questions about the extent of liability,” and then adds that “[t]his is not to say that an employer’s liability for misclassification may never be decided on a classwide basis.” (Id. at p. 37.)

Consistent with our settled precedent, today’s opinion continues to encourage trial courts to be “ ‘procedurally innovative’ ” in managing class actions and leaves open “the appropriate use of representative testimony, sampling, or other procedures employing statistical methodology.” (Maj. opn., ante, at p. 33; see Sav-On Drug Stores, Inc. v. Superior Court (2004) 34 Cal.4th 319, 339-340 [17 Cal.Rptr.3d 906, 96 P.3d 194] (Sav-On); City of San Jose v. Superior Court (1974) 12 Cal.3d 447, 453 [115 Cal.Rptr. 797, 525 P.2d 701].) Here I offer a few comments to further elucidate the proper inquiry at the class certification stage of an employee misclassification case and the duty of trial courts to manage individual issues in a class action trial.

I.

The threshold task for determining whether a class action is appropriate in a particular case is to inquire whether the substantive law governing the plaintiffs’ claims renders those claims amenable to class treatment. Because disputes over the facts or methods of proof that bear on class certification are often, in reality, disputes over “the substantive law that governs the litigation,” it is important that courts employ a proper understanding of the substantive governing law to inform the class certification decision, and not the other way around. (Nagareda, Class Certification in the Age of Aggregate Proof (2009) 84 N.Y.U. L.Rev. 97, 104; see id. at pp. 105-106 [“This is not to suggest that class actions — any more or less than conventional, individual lawsuits — cannot serve as vehicles for change in legal doctrine. It is simply to say that the proposed class-wide nature of the litigation should exert no independent weight in arguments for such change.” (fn. omitted)].) The exposition of substantive law should be independent of the fact that “the case at hand happens to take a proposed aggregate form.” (Id. at p. 108.)

The question in this case is whether the employees in the proposed class are “outside salespersons” exempt from the state’s overtime laws. An “ ‘[o]utside salesperson’ ” is one “who customarily and regularly works more than half the working time away from the employer’s place of business selling tangible or intangible items or obtaining orders or contracts for products, services or use of facilities.” (Industrial Welfare Com., wage order No. 4-2001, subd. 2(M) (Wage Order No. 4-2001, subd. 2(M)); Cal. Code Regs., tit. 8, § 11040, subd. 1(C).) We set forth an authoritative construction of the term in Ramirez v. Yosemite Water Co. (1999) 20 Cal.4th 785 [85 Cal.Rptr.2d 844, 978 P.2d 2] (Ramirez), a case having nothing to do with class actions. The central dispute in Ramirez was whether the outside salesperson exemption should be construed in the same manner as an analogous federal exemption, which “focuses on defining the employee’s ‘primary function,’ not on how much work time is spent selling.” (Id. at p. 797.) Rejecting that view, we concluded that the relevant wage order “incorporates a quantitative method for determining whether an employee is an outside salesperson that differs in some respect from the qualitative method employed under federal law.” (Id. at p. 798.)

In elaborating upon “California’s distinctive quantitative approach to determining which employees are outside salespersons,” Ramirez resolved a question that had confused litigants and lower courts: “Is the number of hours worked in sales-related activities to be determined by the number of hours that the employer, according to its job description or its estimate, claims the employee should be working in sales, or should it be determined by the actual average hours the employee spent on sales activity? The logic inherent in the [Industrial Welfare Commission’s] quantitative definition of outside salesperson dictates that neither alternative would be wholly satisfactory. On the one hand, if hours worked on sales were determined through an employer’s job description, then the employer could make an employee exempt from overtime laws solely by fashioning an idealized job description that had little basis in reality. On the other hand, an employee who is supposed to be engaged in sales activities during most of his working hours and falls below the 50 percent mark due to his own substandard performance should not thereby be able to evade a valid exemption. A trial court, in determining whether the employee is an outside salesperson, must steer clear of these two pitfalls by inquiring into the realistic requirements of the job. In so doing, the court should consider, first and foremost, how the employee actually spends his or her time. But the trial court should also consider whether the employee’s practice diverges from the employer’s realistic expectations, whether there was any concrete expression of employer displeasure over an employee’s substandard performance, and whether these expressions were themselves realistic given the actual overall requirements of the job.” (Ramirez, supra, 20 Cal.4th at pp. 801-802.)

Thus, in recognizing that California’s definition of an outside salesperson is quantitative in nature, Ramirez did not say that the test boils down to whether a particular employee actually spends more than 50 percent of his or her working hours on outside sales. Instead, the ultimate question is: what are “the realistic requirements of the job”? (Ramirez, supra, 20 Cal.4th at p. 802.) The primary consideration that informs this inquiry is “how the employee actually spends his or her time.” (Ibid.) But, as Ramirez made clear, this factor is not dispositive because an employee who falls below the 50 percent threshold “should not thereby be able to evade a valid exemption” if the employee “is supposed to be engaged in sales activities during most of his working hours.” (Ibid.) By the same logic, an employee who exceeds the 50 percent threshold should not be classified as exempt if devoting that much time to outside sales is not a realistic requirement of the job. Ramirez’s focus on “the realistic requirements of the job” parallels the wage order’s definition of an outside salesperson as one who not only “regularly” but also “customarily” spends more than half the working time on outside sales activity. (Wage Order No. 4-2001, subd. 2(M), italics added.) How an employee actually spends his or her time is certainly probative of what is customary or realistically required in the performance of a particular job. But so are “whether the employee’s practice diverges from the employer’s realistic expectations, whether there was any concrete expression of employer displeasure over an employee’s substandard performance, and whether these expressions were themselves realistic given the actual overall requirements of the job.” (Ramirez, at p. 802.)

Once we have brought into focus the ultimate issue of “the employer’s realistic expectations” or “the realistic requirements of the job” (Ramirez, supra, 20 Cal.4th at p. 802), it is not difficult to contemplate that employees in a given job classification will often be either wholly exempt or wholly nonexempt, since a job classification often entails a common set of employer expectations or requirements for performance of the job. That is not to say that trial courts should simply rely on “an employer’s job description” in deciding whether employees are outside salespersons; as Ramirez warned, “the employer could make an employee exempt from overtime laws solely by fashioning an idealized job description that had little basis in reality.” (Ibid.) How employees actually spend their time obviously matters. But Ramirez also warned that it “would [not] be wholly satisfactory” to rely solely on “the actual average hours the employee spent on sales activity.” (Ibid.) Variability in such hours does not necessarily prove that the employer’s realistic expectations or the realistic requirements of the job were not the same for all employees in a given job classification.

We addressed the implications of Ramirez for class actions in Sav-On, supra, 34 Cal.4th 319, which upheld certification of a class of drug store employees who alleged they had been misclassified as managers exempt from overtime laws. The defendant in Sav-On argued that the managerial exemption, like the outside salesperson exemption in Ramirez, turns on “ ‘the actual tasks performed by each class member, the amount of time each class member spent on those tasks, and how the class member’s practices compare to the employer’s reasonable expectations,’ ” and that such individualized factors necessarily bar class certification. (Id. at p. 335.) We rejected this argument: “Presence in a particular overtime class action of the considerations reviewed in Ramirez does not necessarily preclude class certification. Any dispute over ‘how the employee actually spends his or her time’ (Ramirez, supra, 20 Cal.4th at p. 802), of course, has the potential to generate individual issues. But considerations such as ‘the employer’s realistic expectations’ (ibid.) and ‘the actual overall requirements of the job’ (ibid.) are likely to prove susceptible of common proof. Defendant’s ‘realistic expectations,’ in particular, may become relevant in this case, and a reasonable court could conclude these are susceptible of common proof.” (Sav-On, at pp. 336-337; see maj. opn., ante, at p. 27 [“Job requirements and employer expectations of how duties are to be performed may often be established by evidence relating to a group as a whole.”].) In the present case, defense witnesses testified that all business banking officers (BBOs) were expected to spend the majority of their time on outside sales activity. (Id. at p. 19.)

Sav-On went on to say that “our observation in Ramirez that whether the employee is an outside salesperson depends ‘first and foremost, [on] how the employee actually spends his or her time’ (Ramirez, supra, [20 Cal.4th] at p. 802) did not create or imply a requirement that courts assess an employer’s affirmative exemption defense against every class member’s claim before certifying an overtime class action.” (Sav-On, supra, 34 Cal.4th at p. 337.) Such an approach, we said, would “require as a prerequisite to certification that plaintiffs demonstrate defendant’s classification policy was . . . either ‘right as to all members of the class or wrong as to all members of the class,’ ” thereby reversing the employer’s burden to prove the employee’s exemption. (Id. at p. 338.) “Ramirez is no authority for such a requirement, nor does the logic of predominance require it.” (Ibid.)

Since Sav-On, a number of Courts of Appeal have upheld denials of class certification in employee misclassification cases based on the conclusion they were not amenable to common proof. In some cases, preliminary evidence revealed that a common job classification and description did not actually reflect common employer expectations or requirements. (See, e.g., Arenas v. El Torito Restaurants, Inc. (2010) 183 Cal.App.4th 723, 734 [108 Cal.Rptr.3d 15] [affirming denial of class certification where trial court credited defense evidence that duties of a restaurant manager varied significantly from restaurant to restaurant].) But Sav-On made clear that variation in how employees spend their time does not, by itself, preclude a finding that an employer’s realistic expectations are susceptible to common proof. Here, under the relevant wage order, the ultimate question is whether BBOs “customarily and regularly” spend more than half their working time on exempt tasks. (Wage Order No. 4-2001, subd. 2(M), italics added.)

II.

As today’s opinion explains, the predominance of common issues “is not the only consideration. In certifying a class action, the court must also conclude that litigation of individual issues, including those arising from affirmative defenses, can be managed fairly and efficiently.” (Maj. opn., ante, at pp. 28-29; see Brinker Restaurant Corp. v. Superior Court (2012) 53 Cal.4th 1004, 1054 [139 Cal.Rptr.3d 315, 273 P.3d 513] (conc. opn. of Werdegar, J.) (Brinker) [“whether in a given case affirmative defenses should lead a court to approve or reject certification will hinge on the manageability of any individual issues”].)

A principal error in this case was the trial court’s refusal to consider declarations from class members outside of the representative witness group during the trial. I agree that “[i]n rigidly adhering to its flawed trial plan and excluding relevant evidence central to the defense, the court here did not manage individual issues. It ignored them.” (Maj. opn., ante, at p. 34.) What would it mean to “manage individual issues” in the context of an employee misclassification case? To aid the trial court on remand, as well as future courts in similar cases, I briefly address this question.

At the outset, it must be remembered that a declaration indicating that an employee typically spent more than 50 percent of the workday engaged in outside sales activity does not dispositively show that the employee was properly classified as exempt. Rather, such a declaration is evidence bearing on the ultimate issue of “the employer’s realistic expectations” or “the realistic requirements of the job.” (Ramirez, supra, 20 Cal.4th at p. 802.) In a trial, such evidence must be assessed for its weight and credibility, and it must be considered together with all other evidence bearing on the ultimate issue.

Further, although a representative sampling approach to proving class liability is not appropriate for all statutory rights (see Wal-Mart Stores, Inc. v. Dukes (2011) 564 U.S. ___ [180 L.Ed.2d 374, 131 S.Ct. 2541]), the need to manage individual issues does not foreclose the use of sampling, representative testimony, or other statistical methods to obtain relevant evidence in a class action trial on employee misclassification. However, because such methods are inherently designed to reveal generalized characteristics of a population, they pose the risk that a defendant’s affirmative defenses as to individual employees will not be properly adjudicated. There are two ways that a trial court should consider individual issues in this context.

First, consideration of individual issues should inform the design of any sampling or similar statistical approach. As today’s opinion notes, “[i]t is impossible to determine an appropriate sample size without first learning about the variability in the population.” (Maj. opn., ante, at p. 42.) In other words, a valid sampling plan must take into account individual variation within the population, and in that sense, consideration of individual issues is “baked into” the plan’s design. Litigation over the degree or nature of variability in the population may result in a determination that no valid sampling plan would be practical or efficient, that multiple samples must be used in order to capture heterogeneity within the class, or that a sampling plan is viable only for a certain subset of the class.
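The dependence of sample size on population variability can be illustrated with a minimal sketch. It assumes the standard textbook formula for estimating a mean under simple random sampling, n = (z * sigma / E)^2, which is not a method attributed to the trial court or to either party's experts. The function name, the standard deviations, and the target margin of error are hypothetical values chosen only for illustration.

```python
import math

# Textbook sample-size formula for estimating a mean (an assumption here,
# not the court's or the experts' method): n = (z * sigma / E)^2.
Z_95 = 1.96  # conventional z-value for a 95 percent confidence level


def required_sample_size(std_dev, margin_of_error, z=Z_95):
    """Smallest whole sample size meeting the target absolute margin of error."""
    return math.ceil((z * std_dev / margin_of_error) ** 2)


# Hypothetical populations: weekly overtime hours with low and high variability,
# each estimated to within plus or minus 1 hour.
low_variability = required_sample_size(std_dev=5, margin_of_error=1)
high_variability = required_sample_size(std_dev=10, margin_of_error=1)

print(low_variability)   # 97
print(high_variability)  # 385
```

Under these assumed figures, doubling the variability roughly quadruples the required sample, which is why the variability must be learned before a sample size can be chosen.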

Second, even when a trial court has settled on a valid sampling plan, the defendant is entitled to raise individual issues that challenge the results of the plan as implemented. The defendant may introduce evidence, such as individual declarations, suggesting that the sample was not truly representative or that the margin of error admits substantial variation around an average or generalized finding. Faced with evidence from individual and aggregate methods of proof, the trial court must reasonably resolve any conflicts. In so doing, the court may arrive at many possible conclusions, depending on the evidence.

As noted, the court must assess the credibility of the individualized evidence. Here, for example, the credibility of declarations as to how much time an employee spent on outside sales activity may depend on whether the employer or employee kept contemporaneous records of his or her time. Or the trial court could call some of the declarants to testify and assess whether their testimony confirmed or contradicted their declarations. The court might find that the individualized evidence lacks credibility and that the sampling evidence is reliably probative of the employer’s realistic expectations. Or the court might find that the individualized evidence is credible and casts doubt on the validity of the sampling plan as executed. In the latter case, the court might conclude that variability cannot be managed in a class proceeding and that the class should be decertified. Or the court might notice patterns that suggest unmanageable variation in particular subgroups, resulting in partial decertification. Or the court might conclude that the individualized evidence shows only a few outliers that can be handled through minitrials without disrupting the class proceeding.

Alternatively, the court might find that the individualized evidence, while credible, does not show variability in the class but rather provides strong, consistent evidence of the employer’s realistic expectations for the job at issue. Such evidence, depending on what it showed, could support a finding of exemption or nonexemption for the entire class, thereby corroborating or undercutting the sampling evidence. Or such evidence could support a finding of liability for a subset of the class, while tending to disprove liability for the remainder.

The important point is that neither an aggregate method of proof (like sampling or representative witness testimony) nor individualized evidence (like a declaration) is necessarily dispositive when the ultimate issue at trial is to determine “the employer’s realistic expectations” or “the realistic requirements of the job.” (Ramirez, supra, 20 Cal.4th at p. 802.) The two types of evidence must be considered and weighed alongside each other, and more broadly, they must be considered and weighed together with the full range of evidence bearing on the ultimate issue, including the employer’s job description, company policies, industry customs, and testimony of supervisors or managers who monitored, evaluated, or otherwise set expectations for employees in the class. We entrust our trial courts with the task of weighing such multidimensional evidence, and their judgments will be sustained if supported by substantial evidence. (Cf. Brinker, supra, 53 Cal.4th at p. 1017 [class certification upheld when supported by substantial evidence of employer’s uniform unlawful policy].)

A class action trial plan, however well conceived, cannot anticipate every possible development. The trial court must address individual issues when they arise. In so doing, the court has a great deal of discretion — from determining the weight to be given to individualized and aggregate evidence, to determining how much variability such evidence suggests there is in the class, to determining what implications such evidence has for continued certification of the class and for the ultimate merits of the case. As we said in Sav-On: “Courts seeking to preserve efficiency and other benefits of class actions routinely fashion methods to manage individual questions. For decades ‘[t]his court has urged trial courts to be procedurally innovative’ [citation] in managing class actions, and ‘the trial court has an obligation to consider the use of . . . innovative procedural tools proposed by a party to certify a manageable class’ [citations]. Such devices permit defendants to ‘present their opposition, and to raise certain affirmative defenses.’ [Citation.]” (Sav-On, supra, 34 Cal.4th at pp. 339-340, fns. omitted; see id. at p. 339, fns. 11-12 [providing numerous examples of methods to manage individual issues, including bifurcation, subclasses, questionnaires, and individualized hearings].)

III.

Today’s opinion properly identifies the shortcomings of the representative witness group in this case and the trial court’s failure to give due consideration to the individualized evidence that U.S. Bank National Association (USB) sought to introduce in its defense. But it is important to note that the trial court focused on the right question on the merits: What were the realistic requirements of the BBO position?

At trial, no party argued that USB lacked common expectations and requirements for BBOs. According to USB’s answer brief, USB presented evidence, including testimony of BBO supervisors, that it “expects BBOs to spend 80 per cent of their time on these ‘outside sales activities.’ ” Plaintiffs, on the other hand, presented evidence that the BBO position required employees to spend most of their time on telemarketing and other in-office tasks. The trial court, after hearing the evidence, made detailed findings in support of its conclusion that “it is not realistic for BBOs to spend more than half of their work time outside of bank locations because the credit or loan transaction cannot be consummated, nor the sales goal met, without substantial effort that does not or cannot be performed outside of bank locations.”

Such findings, if based on substantial evidence, ordinarily would be sufficient to show the nonexempt status of employees under the relevant wage order. In this case, however, we cannot have confidence in such findings because the trial court did not use a valid representative witness group or consider individualized evidence that might have presented a more complete picture of the class. On remand, the trial court must start anew by assessing whether there is a trial plan that can properly address both common and individual issues if the case were to proceed as a class action. 
      
       This position was previously called “Small Business Banker.” It was renamed after a merger in 2001, but the duties of the position did not change. To avoid confusion, we refer to all employees in this position, both before and after 2001, as BBOs.
     
      
       Labor Code section 1171 states that statutes governing wages, hours, and working conditions, including the requirement of overtime pay, “shall apply to and include men, women and minors employed in any occupation, trade, or industry, whether compensation is measured by time, piece, or otherwise, but shall not include any individual employed as an outside salesman . . . .” (Italics added.)
     
      
       Plaintiffs alleged the same conduct also violated Business and Professions Code section 17200 and constituted conversion.
     
      
       The original named plaintiff, Amina Rafiqzada, was replaced in February 2003. A little over two months later, Rafiqzada testified that she spent approximately 60 percent of her workday outside the branch office. The second group of named plaintiffs (Vanessa Haven, Abby Karavani, and Parham Shekarlab) testified in June 2003 that they spent from 60 to 90 percent of the workday outside the office. These three were replaced in a second amendment to the complaint on March 14, 2005, just days before the court ruled on the first round of certification motions.
     
      
       Four of the 75 individuals who gave declarations for USB later provided contradictory declarations supporting the lawsuit. Although plaintiffs strenuously challenge the authenticity of these declarations, they were not excluded from consideration, at the certification stage or later, on this ground. (See post, fn. 10.)
     
      
       We use the terms “outside sales” or “outside activities” to describe work done away from a bank office.
     
      
       USB’s petitions for writ of mandate in the Court of Appeal and for review in this court were denied.
     
      
       It appears these alternates were later used to replace RWG members who opted out and one who was removed by the court. (See post, at pp. 16-17, 44.)
     
      
      
        Walsh v. IKON Office Solutions, Inc. (2007) 148 Cal.App.4th 1440 [56 Cal.Rptr.3d 534]; Dunbar v. Albertson’s, Inc. (2006) 141 Cal.App.4th 1422 [47 Cal.Rptr.3d 83]; Jimenez v. Domino’s Pizza, Inc. (C.D.Cal. 2006) 238 F.R.D. 241.
     
      
       These appear to be the same declarations USB offered when it opposed class certification. Plaintiffs disputed the authenticity of these declarations and asserted defense counsel had coerced some class members into signing them. When the trial court later ruled on in limine motions, it did not exclude USB’s declarations but cautioned that their weight would be “adjusted because of their authorship, the circumstances of preparation[,] and internal inconsistencies and ambiguities.”
     
      
       One RWG member, Borsay Bryant, did not respond to a trial subpoena. There was no indication why he ignored the court’s order. The court found that Bryant had not waived his claim for overtime compensation but would instead be treated like other class members who did not testify, i.e., those outside the RWG, and presumed to be nonexempt.
     
      
       “Weight” simply reflects the total number of weeks the RWG members worked overtime. So, for example, if one RWG member worked 10 hours of overtime for three weeks and another worked 20 hours of overtime for one week, the weighted average of their overtime would be 50 hours (30 hours for RWG member one plus 20 hours for RWG member two) divided by the four total weeks worked, for an average of 12.5 overtime hours per week.
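The weighted average described in this footnote can be sketched as follows. The two records are the footnote's own illustrative figures; the function and variable names are hypothetical and are not drawn from the record.

```python
# Each tuple is (overtime hours per week, number of weeks worked at that rate),
# using the two hypothetical RWG members from the footnote's example.
rwg_overtime = [
    (10, 3),  # RWG member one: 10 overtime hours per week for three weeks
    (20, 1),  # RWG member two: 20 overtime hours per week for one week
]


def weighted_average_overtime(records):
    """Total overtime hours divided by total weeks worked."""
    total_hours = sum(hours * weeks for hours, weeks in records)
    total_weeks = sum(weeks for _, weeks in records)
    return total_hours / total_weeks


print(weighted_average_overtime(rwg_overtime))  # 12.5
```

An unweighted average of the two weekly figures would instead be 15 hours, so the weighting by weeks worked is what produces the 12.5 figure.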
     
      
       The relative margin of error is the percentage of deviation possible around the point estimate at a specific confidence interval. Here, the point estimate was 11.87 hours, and deviation around this estimate ranged from 6.73 to 17.01 (i.e., plus or minus 5.14 hours) at a 95 percent confidence interval. The relative margin of error is calculated by dividing the absolute margin of error by the point estimate. In this case: 5.14/11.87 = 43.30 percent.
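The calculation in this footnote can be reproduced in a few lines. The three input figures come from the footnote; the variable names are hypothetical.

```python
# Figures stated in the footnote (overtime hours per week).
point_estimate = 11.87
lower_bound = 6.73
upper_bound = 17.01

# The absolute margin of error is half the width of the interval.
absolute_margin = (upper_bound - lower_bound) / 2

# The relative margin of error is the absolute margin divided by the point estimate.
relative_margin = absolute_margin / point_estimate

print(f"{absolute_margin:.2f}")  # 5.14
print(f"{relative_margin:.2%}")  # 43.30%
```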
     
      
       The trial court also prohibited USB from introducing timesheets showing that some class members spent time working in BBO training positions that were properly classified as nonexempt. Even though the class members clearly had no valid misclassification claim for these training weeks, the court reasoned that any evidence pertaining to class members outside the RWG, even evidence directly negating liability, violated its trial plan.
     
      
       All such predictions by the experts were made at a 95 percent confidence interval.
     
      
       USB’s expert criticized Drogin’s use of a midpoint because these calculations assumed, without foundation, that an employee worked the same number of weeks at the high end of the stated range as at the low end. For example, an employee who typically worked 10 overtime hours a week and only occasionally worked 20 overtime hours would be assigned a midpoint of 15, whereas a more accurate figure for the employee’s average overtime would be closer to 10.
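The expert's criticism can be illustrated with a short sketch. The 10-hour and 20-hour figures are from the footnote's example; the split of 45 weeks at the low end and 5 weeks at the high end is an assumption made only to give "occasionally" a concrete value, and does not come from the record.

```python
low_hours, high_hours = 10, 20

# Assumed for illustration only: how many weeks fell at each end of the range.
weeks_at_low, weeks_at_high = 45, 5

# The midpoint treats both ends of the stated range as equally common.
midpoint = (low_hours + high_hours) / 2

# A week-weighted average reflects how often each figure actually occurred.
weighted_average = (
    low_hours * weeks_at_low + high_hours * weeks_at_high
) / (weeks_at_low + weeks_at_high)

print(midpoint)          # 15.0
print(weighted_average)  # 11.0
```

Under these assumed weeks, the midpoint overstates the employee's average weekly overtime by four hours.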
     
      
       In other words, a member of the RWG who had worked little overtime, or worked outside the office frequently enough to be exempt, could remove herself from the sample group in the hope that her replacement would offer testimony more favorable for the class.
     
      
       Nineteen is the number of RWG members who testified at trial, excluding the nonrandomly selected named plaintiffs who testified.
     
      
       At an average compensation of $57,000, an unsupported award to 36 class members amounts to over $2 million.
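The arithmetic behind this figure is a single multiplication, sketched here with the footnote's two numbers; the variable names are hypothetical.

```python
average_compensation = 57_000  # dollars, as stated in the footnote
class_members = 36             # as stated in the footnote

unsupported_award = average_compensation * class_members

print(f"${unsupported_award:,}")  # $2,052,000
```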
     
      
       Although Drogin’s calculations yielded an estimate of 11.87, the court found that class members worked an average of 11.86 hours of overtime per week. We therefore use 11.86 in all further discussion of the overtime estimate.
     
      
       The court separately awarded $25,373 to the named plaintiffs for meal and rest break violations.
     
      
       The same court, Division One of the First District Court of Appeal, had previously decided Bell, supra, 115 Cal.App.4th 715.
     
      
       The first was Bell, supra, 115 Cal.App.4th 715, which resulted in a classwide verdict that was almost entirely upheld on appeal. As we will discuss, however, this case differs from Bell in significant particulars.
     
      
       Pursuant to authority granted by various Labor Code provisions, the Industrial Welfare Commission regulates wages and working hours in various industries through wage orders. (See Harris v. Superior Court (2011) 53 Cal.4th 170, 176-177 [135 Cal.Rptr.3d 247, 266 P.3d 953]; Martinez v. Combs (2010) 49 Cal.4th 35, 55-57 [109 Cal.Rptr.3d 514, 231 P.3d 259].)
     
      
       Early in the litigation, USB argued BBOs could also fall within the administrative and commissioned salesperson exemptions (Wage Order No. 4-2001, subds. 1(A)(2), 3(D)). These affirmative defenses were dismissed before trial on plaintiffs’ motion for summary adjudication. The trial court also ruled that California law does not permit the “tacking” of time worked under multiple exceptions to meet the required 50 percent threshold for exempt time. These rulings were challenged in the Court of Appeal, but that court did not reach them. They are not before us here.
     
      
      
        Ramirez construed the 1980 predecessor to present Industrial Welfare Commission wage order No. 7-2001 (Cal. Code Regs., tit. 8, § 11070), which governs wages, hours, and working conditions in the mercantile industry. Although this case is governed by a different regulation, the wage orders’ definitions of “outside salesperson” are the same.
     
      
       The vast majority of cases settle after a class action is certified. In a 2010 study conducted by the Administrative Office of the Courts, 89 percent of cases certified as a class action ended in settlement, compared with 15 percent of cases in which certification was denied. (Administrative Off. of the Courts, Off. of Court Research, Class Certification in Cal., Second Interim Report from the Study of Cal. Class Action Litigation (Feb. 2010) p. 23 <http://www.courts.ca.gov/documents/classaction-certification.pdf> [as of May 29, 2014].)
     
      
       In regard to other wage and hour claims, some courts have held that the absence of a uniform policy supports certification if such a policy is required by law. (See, e.g., Benton v. Telecom Network Specialists, Inc. (2013) 220 Cal.App.4th 701, 724-725 [163 Cal.Rptr.3d 415] [failure to adopt policy authorizing meal and rest breaks]; Bradley v. Networkers Internat., LLC (2012) 211 Cal.App.4th 1129, 1150-1151 [150 Cal.Rptr.3d 268] [same].) We express no opinion on this question.
     
      
       Federal courts have observed that the uniform application of an exemption, standing alone, “does nothing to facilitate common proof on the otherwise individualized issues.” (In re Wells Fargo Home Mortgage Overtime Pay Litigation (9th Cir. 2009) 571 F.3d 953, 959.) While we agree with this principle, here the trial court identified additional common issues supporting certification.
     
      
       Title VII of the Civil Rights Act of 1964. (42 U.S.C. § 2000e et seq.)
     
      
       Plaintiffs insist there is substantial evidence to support the trial court’s finding that the BBO position was standardized. We cannot so conclude because the court erroneously excluded all evidence USB offered contradicting this view. Indeed, the court’s own findings belie any notion of uniformity because it found USB had no requirements or expectations regarding where BBOs worked, or how much, so long as they met their sales goals.
     
      
       “Population” is a term of art in statistics, referring to the entire group the sample is intended to represent. Here, the sample consisted of the testifying members of the RWG. The population was the entire class of BBOs.
     
      
       The Dilts plaintiffs ultimately did not introduce statistical evidence at trial, or other evidence establishing a common policy or practice that prohibited off-duty breaks, and the court dismissed the class claims for off-the-clock work. (Dilts v. Penske Logistics, LLC (S.D.Cal., Jan. 21, 2014, No. 08-cv-318-CAB (BLM)) 2014 WL 305039.)
     
      
       At first blush, this degree of cooperation seems puzzling given the other hotly contested aspects of the Bell litigation. But the evolution of the case shows that the interests of both sides were aligned following remand. Once the issue of liability had been decided, both sides benefited from a fair, cost-effective approach to determining damages. By agreeing on a sampling approach, both sides could expedite resolution while preserving their competing interests in calculating damages.
     
      
       The parties disagree about whether standards of proof are relaxed or heightened when a wage and hour class action proceeds solely under the UCL. Citing several false advertising cases, plaintiffs assert that relief under the UCL is generally available without individualized proof of injury. (See In re Tobacco II Cases (2009) 46 Cal.4th 298, 320 [93 Cal.Rptr.3d 559, 207 P.3d 20].) In contrast, USB argues the UCL imposes a stricter proof requirement for restitution than is otherwise required for damages, because restitution is only available for property the defendant wrongfully obtained through an unfair business practice. (See Cortez v. Purolator Air Filtration Products Co. (2000) 23 Cal.4th 163, 177-178 [96 Cal.Rptr.2d 518, 999 P.2d 706].) Thus, USB argues, plaintiffs must show they were actually misclassified and are owed unpaid wages for overtime. We need not resolve this issue. Even under a relaxed standard of proof, the restitution award here was subject to such a high margin of error that the judgment cannot stand.
     
      
       In general, the narrower the confidence interval, the more precise the estimate. For a given sample, a higher confidence level can be attained only by widening the interval around the estimate. A high confidence level coupled with a small interval indicates that random error in the estimate is low. (See 1 Kaye & Freedman, Modern Scientific Evidence: The Law and Science of Expert Testimony (2010-2011) § 6:34, pp. 361-363.) Conversely, a wide confidence interval, such as the one produced here, shows the estimate is less likely to be accurate.
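       The relationship between confidence level and interval width can be illustrated with a minimal sketch. The figures below are hypothetical and are not drawn from the record; the function name confidence_interval is likewise an illustrative assumption, and the calculation uses a simple normal approximation rather than whatever method the expert applied at trial. With the same sample, moving from a 90 percent to a 95 percent confidence level leaves the estimate unchanged but widens the margin of error around it.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, level):
    """Return (estimate, margin of error) for the sample mean.

    Uses a normal approximation; for a sample this small a t-based
    interval would be wider still. Illustrative only.
    """
    # Two-sided critical value for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    estimate = mean(sample)
    margin = z * stdev(sample) / sqrt(len(sample))
    return estimate, margin


# Hypothetical weekly overtime hours for seven sampled employees.
hours = [3.0, 6.0, 9.0, 12.0, 15.0, 18.0, 21.0]

est90, m90 = confidence_interval(hours, 0.90)
est95, m95 = confidence_interval(hours, 0.95)

# Relative margin of error: the margin as a fraction of the estimate.
rel95 = m95 / est95

print(f"90% level: {est90:.2f} hours, plus or minus {m90:.2f}")
print(f"95% level: {est95:.2f} hours, plus or minus {m95:.2f}")
print(f"relative margin at 95%: {rel95:.1%}")
```

       In this sketch the only way to narrow the interval without giving up confidence is to reduce the standard error, which in practice means a larger or less variable sample.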
     
      
       For an explanation of how this relative margin of error was calculated, see ante, at page 20, fn. 13.
     
      
       Ironically, plaintiffs now appear to agree with USB that some individual proceedings are appropriate during phase two. In their reply brief, plaintiffs state, “It is during the damages phase that a defendant in USB’s position has the right to challenge relief to individual class members so long as there is a reasonable basis for doing so.”
     
      
      Retired Associate Justice of the Supreme Court, assigned by the Chief Justice pursuant to article VI, section 6 of the California Constitution.
     