
    TRINITY INDUSTRIES, INC., for Itself and on Behalf of Certain Subsidiaries, Plaintiff-Appellant, v. UNITED STATES of America, Defendant-Appellee.
    No. 12-11012.
    United States Court of Appeals, Fifth Circuit.
    July 2, 2014.
    
      Thomas S. Leatherbury, Esq., Daniel L. Tobey, Esq., John C. Wander, Vinson & Elkins, L.L.P., Dallas, TX, Douglas Scott Draper, Esq., Heller, Draper, Hayden, Patrick & Horn, L.L.C., New Orleans, LA, George Matthew Gerachis, Vinson & Elkins, L.L.P., Houston, TX, for Plaintiff-Appellant.
    Andrew M. Weiner, Richard Bradshaw Farber, Esq., Supervisory Attorney, U.S. Department of Justice, Washington, D.C., Michael D. Powell, U.S. Department of Justice, Dallas, TX, for Defendant-Appellee.
    Before OWEN, SOUTHWICK, and GRAVES, Circuit Judges.
   PRISCILLA R. OWEN, Circuit Judge:

Trinity Industries, Inc. (Trinity) designed and built vessels during the taxable years ending March 1994 and March 1995 (the claim years). On its amended tax returns, Trinity claimed research tax credits under Internal Revenue Code (I.R.C.) § 41 based on several of these vessel projects. The I.R.S. denied these claims. Trinity then filed this tax refund action in federal court, seeking research tax credits based on the projects. After a two-phase bench trial, the district court held that the tax credit due Trinity was $135,787.60 for 1994 and $0 for 1995. Trinity now appeals, asserting that it is entitled to a tax credit of $1,808,832.53 for 1994 and $2,712,977.00 for 1995. We affirm in part and vacate and remand in part.

I

Trinity’s amended tax returns for 1994 and 1995 claimed that it was entitled to a research tax credit because its claim year expenses in developing certain vessels constituted qualified research expenses (QREs). As discussed more fully below, I.R.C. § 41 generally provides a 20% credit for claim year QREs that exceed what the taxpayer spent on research in an earlier comparison period (the base amount). The base amount, in turn, is a “fixed base percentage” multiplied by the company’s average annual gross receipts for the four years preceding the claim year. In calculating its research tax credit, Trinity’s amended tax returns reported a fixed base percentage (the ratio of base period QREs over base period gross receipts) of 1.3152% for the taxable year ending March 1994 and 1.3125% for the year ending March 1995. The tax returns themselves do not report the base period QREs or the base period gross receipts used to calculate the fixed base percentage. The I.R.S. denied these claims in a 2001 claim disallowance letter.

Trinity subsequently filed this tax refund action in federal court. Before trial began, Trinity retained James Bennett as an expert. Bennett submitted a report finding that the “consistency rule” under I.R.C. § 41(c)(6) — which requires that claim year QREs and base period QREs be computed on a consistent basis — was satisfied on Trinity’s amended tax returns. Bennett noted only one caveat to this conclusion: the records available for the claim years were more complete than those available for the base period years, so he estimated certain costs for the base period. Based on documentation provided by Trinity, Bennett also provided specific calculations of the base period QREs, the base period gross receipts, and the fixed base percentage. Bennett calculated the overall base period QREs as $49,483,136. Dividing this base period QRE figure by the base period gross receipts ($3,851,683,536) yielded a fixed base percentage of 1.2847%, which was slightly lower than the fixed base percentages reported in the amended tax returns. Although Bennett based his calculations on the same records used to complete the amended tax returns, it is not clear why his fixed base percentage figure was slightly lower.
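
Bennett’s percentage can be checked directly from the two figures in his report. The worked arithmetic below uses only those numbers; it does not explain why the amended returns reported slightly higher percentages.

```latex
\[
\text{fixed base percentage}
  = \frac{\$49{,}483{,}136}{\$3{,}851{,}683{,}536}
  \approx 0.012847
  = 1.2847\%
\]
```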

The district court conducted a two-phase bench trial. In Phase I, the court considered claimed tax credits for six vessel development projects. In its order following Phase I, the court decided that Trinity was wrongly denied credits for only two of the six projects it considered because these two projects (the Mark V and the Dirty Oil Barge) met all four requirements for constituting QREs. According to the court, the other four vessels (the XFPB, the T-AGS 60, the Crew Rescue Boat, and the Hurley Dredge) did not meet the fourth QRE requirement: that substantially all of the research activities in developing the project (i.e., 80% or more) were part of a process of experimentation.

In reaching these conclusions, the court explained that the “shrinking-back rule” in the Treasury regulations ordinarily allows taxpayers to show that smaller subcomponents of a given project satisfy the process-of-experimentation test even if the entire project does not. For instance, if a whole vessel project does not satisfy the test, perhaps the development of the vessel’s engine is sufficiently experimental. In this case, however, the court noted that “Trinity took an all or nothing approach” because it did not offer proof of its claim year expenses at the subcomponent level. Trinity was unable to offer evidence of its expenses at a more specific level partly because Hurricane Katrina destroyed many of its records. The court thus made its determination on the fourth QRE requirement based on whether, considering each of the six claim year projects as a whole, 80% of the costs incurred in the development of each project were part of a process of experimentation. It did not apply the shrinking-back rule in analyzing the claim year QREs.

In Phase II of the trial, two other vessel projects (the Queen of New Orleans and the Penn Tugs), as well as the method of calculating Trinity’s base period QREs, were at issue. With regard to its base period QRE calculation, Trinity called as a witness Phil Nuss, Trinity’s former Vice President of Engineering. Nuss first confirmed that the ten vessels identified by Bennett in his report were the vessels used in computing the base period QREs on the amended tax returns. Trinity’s counsel then asked Nuss whether he believed expenses related to those ten vessels should still be counted as QREs given the district court’s Phase I order holding that certain claim year vessel expenses were not QREs. Nuss answered that expenses relating to four of the ten base period vessels should no longer be counted. According to Nuss, two of the base period vessels — the LSV and the North Carolina Auto Ferry — were similar to the Hurley Dredge, one of the claim year vessels held not to be qualified research in Phase I, since they all involved Trinity constructing a vessel based on a design provided to Trinity by a third party. In addition, Nuss believed that another base period vessel — the Cajun Queen — was like the Crew Rescue Boat, a claim year vessel held not to be qualified research in Phase I, since the Cajun Queen was also not a complicated technological boat to build. Finally, Nuss testified that a fourth base period vessel — the Ecuador — was like the XFPB, which the district court held was not qualified research in Phase I, since the Ecuador similarly had some experimental features but not enough to satisfy the QRE test. Nuss thus concluded that these four base period vessels should no longer be included in the base period QRE figure, though the other six base period vessels still should be.
There was also similar testimony from Sam Charters, Trinity’s former Chief Project Engineer, that the LSV was similar, in terms of the amount of experimentation involved, to the Hurley Dredge held not to constitute qualified research.

After Phase II of the trial concluded, the parties submitted briefing addressing whether the two vessel projects at issue constituted QREs, as well as the proper base period QRE figure under the consistency rule. Trinity made two distinct arguments about the consistency rule. Its first was that it had followed the consistency rule on its amended tax returns by calculating both its claim year QREs and its base period QREs using an all-or-nothing approach. In making this argument, Trinity acknowledged that it did not use the shrinking-back rule in computing its base period QREs on its returns:

On the Amended Returns, Trinity sought a tax credit for only certain vessels that could be considered prototypes. In presenting its claim, Trinity took an “all or nothing” approach — i.e., it did not seek shrink-back credit for any subcomponents of any vessels constructed in the Claim Years. To be consistent with this determination of Claim Year expenditures, Trinity included in the Base Period only expenditures for those vessels that it believed were sufficiently experimental such that the entire vessel constituted qualified research. Consistent with the manner in which it determined QREs for the Claim Years, Trinity did not shrink back to subcomponents of any vessels in the Base Period.

In taking this position, Trinity was not asking the court to allow it to recalculate its base year QREs; it was simply defending how it originally calculated its base period QREs on its amended tax returns.

Trinity then made a second, distinct argument regarding the consistency rule. Relying primarily on the testimony of Nuss but also citing Charters’s testimony, Trinity asserted it should be able to remove four vessels from its base period QREs as calculated on its amended tax returns, since those vessels were similar, in terms of how much experimentation was involved, to the four vessels held not to be claim year QREs in Phase I:

In Trinity I the Court articulated a different standard for “prototype” than Trinity applied on the Amended Return. The Court’s holding defined the universe of QREs allowable in the Claim Years.... With respect to four of the projects claimed by Trinity, the Court found that the integration of subsystems did not rise to the level required for the cost of developing and constructing the entire vessel to qualify.
Given this standard, pursuant to [the consistency rule] Trinity’s Base Period QREs must be reevaluated in light of the uncontroverted evidence to ensure they are determined consistent with the QREs in the Claim Years. Mr. Nuss ... considered whether any of the vessels originally included in the Base Period would no longer be considered prototypes under the standard articulated in Trinity I. ... Mr. Nuss testified that expenditures on four vessel projects in the Base Period should not be treated as QRE[s],... For each of these vessels, Mr. Nuss testified that the identification, configuration and integration of the components of the vessels were not sufficiently complex for the vessels to constitute prototypes under the Court’s standard.

Trinity contended that, after removing the four base period vessels from the base period QRE figure of $49,483,136 provided by Bennett’s report, the base period QRE figure would total $26,706,987. Trinity’s second consistency rule argument thus asked the district court to allow it to reduce its base period QRE figure by over $20 million, which in turn would reduce its fixed base percentage and increase its overall research tax credit.
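
The effect of that reduction can be sketched numerically. The following is a minimal illustration, not the court’s or the parties’ computation: the base period figures come from the text above, while the claim year QREs and average annual gross receipts (`CLAIM_QRES`, `AVG_RECEIPTS`) are hypothetical values chosen only to show the direction of the change. The function names are likewise illustrative, and flooring the credit at zero is an assumption drawn from the description of the credit as applying only to QREs that exceed the base amount.

```python
def fixed_base_percentage(base_qres: float, base_receipts: float) -> float:
    """Ratio of aggregate base period QREs to aggregate base period gross receipts."""
    return base_qres / base_receipts


def research_credit(claim_qres: float, fixed_base_pct: float, avg_receipts: float) -> float:
    """Lesser of 20% of the excess over the base amount and 20% of half the claim year QREs."""
    base_amount = fixed_base_pct * avg_receipts
    excess_credit = 0.20 * (claim_qres - base_amount)
    cap = 0.20 * (0.50 * claim_qres)
    # Assumption: no credit (rather than a negative one) when QREs do not exceed the base amount.
    return max(0.0, min(excess_credit, cap))


# Figures taken from the opinion.
BASE_RECEIPTS = 3_851_683_536
ORIGINAL_BASE_QRES = 49_483_136   # Bennett's report, ten vessels
REDUCED_BASE_QRES = 26_706_987    # after removing the four vessels

# Hypothetical inputs, for illustration only.
CLAIM_QRES = 10_000_000
AVG_RECEIPTS = 900_000_000

reduction = ORIGINAL_BASE_QRES - REDUCED_BASE_QRES
original_pct = fixed_base_percentage(ORIGINAL_BASE_QRES, BASE_RECEIPTS)
reduced_pct = fixed_base_percentage(REDUCED_BASE_QRES, BASE_RECEIPTS)
credit_original = research_credit(CLAIM_QRES, original_pct, AVG_RECEIPTS)
credit_reduced = research_credit(CLAIM_QRES, reduced_pct, AVG_RECEIPTS)

print(f"reduction in base period QREs: ${reduction:,}")
print(f"fixed base percentage: {original_pct:.4%} -> {reduced_pct:.4%}")
print(f"credit on hypothetical inputs: ${credit_original:,.2f} -> ${credit_reduced:,.2f}")
```

On these hypothetical inputs the base amount under the original percentage exceeds the claim year QREs, so no credit results, while the reduced percentage leaves an excess on which a credit is computed.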

In its Phase II order, the district court concluded that Trinity was wrongly denied tax credit for the Queen of New Orleans but was correctly denied credit for the Penn Tugs. According to the court, the Penn Tugs did not meet the fourth QRE requirement. With respect to the consistency rule, the court only addressed the merits of Trinity’s first consistency rule argument — Trinity’s defense of its use of an all-or-nothing, “entire project” approach in both the base period years and the claim years:

Trinity now argues that, in view of the consistency rule, its base period QREs should include only projects that were overall 80% or more research. The Court disagrees.... Here the Court applied the 80% rule only to entire projects due to a lack of evidence to permit application of the “shrinkback rule.” This ... simply reflects the absence of evidence of costs incurred on a subset of an entire vessel.... Accordingly, the Court will not exclude from the base period QREs any QREs incurred that were less than 80% of an entire project.

The court accordingly rejected the merits of Trinity’s first argument but failed to address the merits of Trinity’s second argument based on Nuss’s testimony. The court also failed to acknowledge that Trinity did not use the shrinking-back rule in calculating the base period QREs in its amended tax returns, and therefore the base period QREs already excluded any QREs incurred that were less than 80% of an entire project. The same order provided that “[o]ther than Trinity’s consistency argument, the Court credits the testimony of Trinity’s witnesses and finds Trinity’s calculation to be a reasonable estimate of the base amount.”

After additional briefing, the district court issued its final judgment holding that Trinity was entitled to $135,787.60 in tax credit for 1994 and $0 for 1995. Trinity now appeals.

II

“The standard of review for a bench trial is well established: findings of fact are reviewed for clear error and legal issues are reviewed de novo.” A fact finding “is clearly erroneous if it is without substantial evidence to support it, the court misinterpreted the effect of the evidence, or this court is convinced that the findings are against the preponderance of credible testimony.” Mixed questions of law and fact are also reviewed de novo.

III

Trinity first contends that the district court erred in applying the consistency rule in calculating its base period QREs. We begin with an overview of the research tax credit calculation and the consistency rule, and then consider Trinity’s position regarding the consistency rule.

A

Under I.R.C. § 41, companies can claim a 20% credit for QREs that exceed what they spent in an earlier comparison period. Using the terminology of the I.R.C., the research tax credit is calculated as follows:

• Research credit = the lesser of: 20% × (claim year QREs − base amount), and 20% × (50% × claim year QREs);

• Base amount = fixed base percentage × average annual gross receipts for the four years preceding claim year;

• Fixed base percentage = aggregate base period QREs / aggregate base period gross receipts, where base period here = taxable year ending 3/31/1985 through taxable year ending 3/31/1989.

The research tax credit can be summarized in the following formula:
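
The formula itself does not survive in this text. The following is a reconstruction from the three definitions listed above, not a reproduction of the court’s own notation. The symbols are assumptions introduced here: Q_claim is claim year QREs, B is the base amount, Q_base and R_base are aggregate base period QREs and gross receipts, and R-bar_4 is average annual gross receipts for the four years preceding the claim year.

```latex
% Requires amsmath for \text.
\[
\text{credit} = \min\Bigl(
    0.20\,\bigl(Q_{\text{claim}} - B\bigr),\;
    0.20 \times 0.50 \times Q_{\text{claim}}
\Bigr),
\qquad
B = \frac{Q_{\text{base}}}{R_{\text{base}}} \times \bar{R}_{4}
\]
```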

To constitute QREs, four requirements must be met:

(1) the expenses must be of the type deductible under I.R.C. § 174; (2) the research must be undertaken “for the purpose of discovering information ... which is technological in nature;” (3) the application of that information must be “intended to be useful in the development of a new or improved business component of the taxpayer;” and (4) substantially all of the research activities must “constitute elements of a process of experimentation.”

Under Treasury regulations, the fourth requirement is met if 80% or more of the research activities constitute elements of a process of experimentation.

This four-part QRE test is to be applied separately to each business component of the taxpayer, which is defined to include any product held for sale, lease, or license, or used by the taxpayer in its trade or business. If, however, each of the four requirements is not met with respect to an entire business component, the shrinking-back rule under Treasury Regulation § 1.41-4(b)(2) is implicated. Under the shrinking-back rule, these four requirements “are to be applied first at the level of the discrete business component,” but “[i]f these requirements are not met at that level, then they apply at the most significant subset of elements of the product.” “This shrinking back of the product is to continue until either a subset of elements of the product that satisfies the requirements is reached, or the most basic element of the product is reached and such element fails to satisfy the test.” Accordingly, the rule “is applied only if a taxpayer does not satisfy the [four requirements] with respect to the overall business component.”
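
The descent described by the shrinking-back rule can be sketched as a simple loop. This is an illustration under stated assumptions, not the regulation’s procedure: it models only the fourth requirement (the 80% threshold), assumes the first three requirements are met at every level, and uses hypothetical cost figures and a hypothetical intermediate level (“propulsion system”). The names `Level`, `passes_substantially_all`, and `shrink_back` are invented for the example.

```python
from typing import NamedTuple, Optional, Sequence

SUBSTANTIALLY_ALL = 0.80  # "80% or more" threshold for the fourth requirement


class Level(NamedTuple):
    """One level of a product, from the whole business component down to its most basic element."""
    name: str
    experimental_cost: float  # hypothetical: cost of activities that are part of a process of experimentation
    total_cost: float         # hypothetical: total research cost at this level


def passes_substantially_all(level: Level) -> bool:
    """True if at least 80% of the level's research cost is experimental."""
    return level.total_cost > 0 and level.experimental_cost / level.total_cost >= SUBSTANTIALLY_ALL


def shrink_back(levels: Sequence[Level]) -> Optional[Level]:
    """Return the first (largest) level that passes, or None if even the most basic element fails.

    `levels` must be ordered from the whole business component to the most basic element.
    """
    for level in levels:
        if passes_substantially_all(level):
            return level
    return None


# Hypothetical vessel project: the whole vessel fails, but the engine passes.
vessel_levels = [
    Level("whole vessel", 3_000_000, 10_000_000),
    Level("propulsion system", 1_400_000, 2_000_000),
    Level("engine", 900_000, 1_000_000),
]

qualifying = shrink_back(vessel_levels)
print(qualifying.name if qualifying else "no qualifying subset")
```

An “all or nothing” approach corresponds to passing only the first level: the whole vessel either qualifies or yields no QREs at all.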

The consistency rule also plays a role in computing QREs. The rule ensures that the research tax credit due is not overstated or understated because the taxpayer inconsistently compares QREs in the base period years and the claim year. The rule provides that “the [QREs] taken into account in computing [the fixed base] percentage shall be determined on a basis consistent with the determination of [QREs] for the credit year.” The corresponding Treasury regulation uses similar language.

B

On appeal, Trinity challenges the district court’s application of the consistency rule in calculating its base period QREs, and it presents two distinct arguments as to why the rule was violated. The first argument is that the district court violated the consistency rule by applying the shrinking-back rule in the base period years but not applying it in the claim years. This argument hinges on the assumption that the district court used a shrink-back analysis for the base period QREs, so that if expenses related to the entire vessel did not constitute qualified research, expenses related to subcomponents of the vessel were still counted as qualified research. The Government has not contested Trinity’s understanding of the facts, so the parties argue over whether, as a legal matter, when the shrinking-back rule could not be applied in calculating the claim year QREs because of a lack of evidence, the base period QREs also had to be calculated on an all-or-nothing, “whole ship” basis.

There is a significant issue, however, regarding the facts assumed in much of the briefing in our court: the briefs assumed that the shrinking-back rule was applied by the district court, while the record reflects that it was not applied in the district court’s calculation of the base period QREs. In fact, neither Trinity’s amended tax returns, nor Bennett’s report, nor the district court judgment based on Bennett’s report used the shrinking-back rule in computing Trinity’s base period QREs. Trinity’s briefing before the district court acknowledged that it did not use the shrinking-back rule in its amended tax returns. Moreover, Bennett’s report confirms that neither Trinity’s amended tax returns nor his report could have used the shrinking-back rule in computing Trinity’s base period QREs. For one, Bennett’s report stated that he calculated the base period QREs “based on 100% of the wages, supplies and contract labor costs, less overhead costs” for the ten vessel projects in the base period. Including all of the projects’ expenses shows that entire projects were considered qualified research, not just “subcomponents” of the projects like the vessels’ engines. Second, Bennett noted that Trinity’s records for the base period years were even less detailed than the records for the claim years, in which Trinity was unable to avail itself of the shrinking-back rule because records were lacking. This further establishes that neither Bennett’s report nor Trinity’s amended returns applied the shrinking-back rule in the base period.

The district court’s order also did not employ the shrinking-back rule in computing the base period QREs. In its final judgment, the district court held Trinity was entitled to $135,787.60 for 1994 and $0 for 1995. These figures are the same as those in Trinity’s post-trial briefing, which asked the court to adopt the base amount figures from Bennett’s report. This means that the base amounts used in the final judgment were calculated using the fixed base percentage, including the base period QREs, used in Bennett’s report, and Bennett’s report did not employ the shrinking-back rule in calculating the base period QREs. As a result, the district court’s judgment incorporated a base period QRE figure calculated using an all-or-nothing approach, thereby excluding any QREs incurred that were less than 80% of an entire project, even though it held that it would “not exclude from the base period QREs any QREs incurred that were less than 80% of an entire project.” Trinity’s attorney at oral argument acknowledged that the district court did not apply a shrink-back analysis in calculating the base period QREs.

In short, the district court did not use the shrinking-back rule in calculating Trinity’s base period QREs. Trinity argues that the district court should have applied a “whole ship” methodology in calculating its base period QREs, but this is exactly what the district court did by calculating the base amount based on Bennett’s report. Moreover, any argument — from either party — about whether the shrinking-back rule should have been applied in the base period years need not be addressed as the lack of records from the base period years apparently would have precluded the shrinking-back rule from being applied in the base period. Trinity’s first consistency rule argument therefore entitles it to nothing more than what the district court awarded it.

C

Trinity presents a second argument as to why the consistency rule was violated and why it is therefore entitled to a lower base period QRE figure of $26,706,987. The $26,706,987 amount is the base period QRE used in the Bennett report less the QREs attributable to the four base period vessels Nuss testified would not constitute qualified research under the standard articulated by the district court in Phase I of trial.

Nuss testified that he did not believe that four base period vessels — the LSV, the North Carolina Auto Ferry, the Cajun Queen, and the Ecuador — involved the level of experimentation necessary to constitute QREs since they were similar, in terms of how much experimentation was involved, to the claim year vessels the district court held not to be QREs after Phase I. In other words, if the district court found that certain claim year vessels — including the XFPB, the T-AGS 60, the Crew Rescue Boat, and the Hurley Dredge — did not satisfy the QRE test, then four of the base period vessels should not have satisfied the QRE test, either. As discussed above, Nuss testified that two of the base period vessels — the LSV and the North Carolina Auto Ferry — were similar to the Hurley Dredge, since they all involved Trinity constructing a vessel based on a design provided to Trinity by a third party. There was also testimony from Charters that the LSV was comparable to the Hurley Dredge since both were based in part on an existing design. Nuss believed that another base period vessel — the Cajun Queen — was like the Crew Rescue Boat, another claim year vessel held not to be qualified research, since the Cajun Queen was also not a complicated technological boat to build. Finally, Nuss testified that a fourth base period vessel — the Ecuador — was like the XFPB, which the district court held was not qualified research, since the Ecuador similarly had some experimental features but not enough to satisfy the QRE test. Nuss testified that to calculate the QREs on a consistent basis, the four base period vessels should be removed from the base period QRE calculation.

Though much of Trinity’s briefing on appeal focused on its first consistency rule argument concerning the shrinking-back rule, Trinity has adequately raised its second argument based on Nuss’s testimony. In its opening brief, Trinity argued that, under a proper application of the consistency rule, this court should calculate its base period QREs as $26,706,987: the base period QRE amount used in the Bennett report less the QREs attributable to the four vessels Nuss said would not satisfy the district court’s Phase I QRE standard. Trinity’s reply brief again referenced Nuss’s testimony, noting that “[b]ased on the District Court’s ruling in Phase I, [Nuss and Charters] identified four comparable vessels, out of the [ten] submitted by Trinity in the base years, that similarly would not qualify for research credit as whole ships under the Court’s analysis.” Trinity also contended that the panel could reverse and render judgment in its favor based on “Nuss’s testimony of why four ships from the base years ... should be removed under the consistency rule” and referred to “Nuss’s and Charters’s testimony on removing four partial ships from the base years.” At oral argument, Trinity’s counsel again urged that if the district court disqualified certain claim year vessels, it should have also disqualified the four base period vessels, which involved a similar level of experimentation, as not satisfying the process-of-experimentation test. Trinity therefore sufficiently raised this argument on appeal.

Assuming the district court would credit Nuss’s and Charters’s testimony, we agree with Trinity that if certain base period vessels are just as experimental as claim-year vessels held not to be qualified research, those base period vessels should not be counted as qualified research for purposes of the base period QRE calculation. I.R.C. § 41 allows a taxpayer to claim a tax credit for claim year research expenses that exceed the research expenses spent in an earlier comparison period, the base period years. To equitably measure the increase in qualified research spending between the two periods, the same standard should be applied in determining whether certain projects pursued in the two periods are sufficiently experimental to be qualified research.

The consistency rule addresses this very issue: it aims to ensure that the research tax credit due is not overstated or understated because the taxpayer inconsistently compares QREs in the base period years and the claim year. The rule provides that the QREs “taken into account in computing [the fixed-base] percentage shall be determined on a basis consistent with the determination of [QREs] for the credit year.” The corresponding Treasury regulation uses language very similar to the consistency rule itself but adds the word “definition”:

[QREs] ... taken into account in computing a taxpayer’s fixed-base percentage and a taxpayer’s base amount must be determined on a basis consistent with the definition of [QREs] ... for the credit year, without regard to the law in effect for the taxable years taken into account in computing the fixed-base percentage or the base amount.

The regulation then provides two illustrations. In the first, for a hypothetical taxpayer, the statutory definition of QRE for its first two base period years (1984 and 1985) is different than a revised QRE definition applicable in its last three base period years (1986, 1987, and 1988) and in its claim year (2001). To compute the credit for 2001, the taxpayer must apply the new QRE definition to its 1984 and 1985 base period years “to reflect the change in the definition of qualified research” beginning in 1986. The second illustration following the regulation shows another slightly different application of the consistency rule. It states that if the taxpayer counts “a certain type of expenditure” — such as the wages of its research assistants — as QREs in the claim year, it must also count “similar expenditures” as QREs in its base period years. In sum, the consistency rule calls for consistent application of the QRE definition across the base period years and the claim year, including the types of expenditures the taxpayer treats as QREs.

The consistency rule is equally applicable to a case like this one. Here, the district court decided that certain claim year projects were not sufficiently experimental to pass the fourth QRE requirement — that 80% or more of the research activities involved in the project constitute elements of a process of experimentation — and Trinity simply asked the court to consider whether four of its base period projects were also not sufficiently experimental to pass that same test. If, for instance, the Ecuador and the XFPB projects involved exactly the same level of experimentation, then it would violate the consistency rule (and understate Trinity’s tax credit) not to count the XFPB as qualified research in the claim year but to count the Ecuador as qualified research in the base period years. Trinity is entitled to have a consistent QRE test applied to projects in the base period years and the claim years.

The only matter still unsettled is whether the district court would credit Nuss’s and Charters’s testimony that the four base period vessels were similar to the four claim year vessels held not to be qualified research, in terms of how much experimentation was involved. In its order after Phase II of the trial, the district court stated that, “[o]ther than Trinity’s consistency argument, the Court credits the testimony of Trinity’s witnesses and finds Trinity’s calculation to be a reasonable estimate of the base amount.” But again, though Trinity raised in the district court the second consistency rule argument based on Nuss’s and Charters’s testimony, the district court did not address this issue, so it is unclear whether the district court credited those witnesses’ testimony on the four base period vessels.

We therefore remand to the district court for a limited purpose: making a factual finding as to whether to credit the testimony of Nuss and Charters that the four base period vessels were as experimental as (or less experimental than) the four claim year vessels held not to satisfy the fourth QRE requirement. If the district court credits this testimony against any possible conflicting testimony or evidence, then those four base period vessels should be removed from the base period QRE calculation, and the resulting base period QRE figure would be $26,706,987. If the district court finds that the four base period vessels (or some of them) were more experimental than the four claim year vessels and were sufficiently experimental to qualify as QREs, then the base period QRE figure should include the expenses associated with those vessel projects. We thus vacate the district court’s holding as to the consistency rule and remand for findings as to whether, in light of the district court’s Phase I order, the four base period vessels at issue are sufficiently experimental to constitute qualified research.

IV

Trinity also challenges the district court’s conclusion that its research expenses in developing the Penn Tugs did not satisfy the fourth QRE requirement and therefore did not constitute claim year QREs. Trinity first argues that because the I.R.S. already determined that the Penn Tugs met the fourth QRE requirement, the Government was either conclusively or presumptively bound by that decision in this litigation. When the I.R.S. issued its claim disallowance letter, it incorporated a revenue agent report stating that, with respect to the Penn Tugs, the fourth requirement — the process-of-experimentation test — was satisfied, even though two other QRE requirements were not. Even if the Government is not conclusively bound by the conclusion in Trinity’s favor on the fourth QRE requirement, Trinity argues it is entitled to a presumption of correctness, so the Government bore the burden of rebutting it by affirmative evidence.

We disagree. The district court correctly held that the report’s conclusion, though admissible evidence, was neither binding nor entitled to a presumption of correctness. In tax refund actions, the district court reviews de novo the Commissioner’s decision regarding a taxpayer’s tax liability. The taxpayer has the burden of proving by a preponderance of the evidence that the Commissioner’s assessment — its final determination of the taxpayer’s liability — was erroneous, since the assessment is presumed to be correct. While the assessment itself is entitled to a presumption of correctness, the “reasons for the Commissioner’s determination are not relevant for the Court does not review those reasons.” Accordingly, here, while the I.R.S.’s ultimate determination of Trinity’s tax liability is presumptively correct, the revenue agent report’s subsidiary conclusion that the Penn Tugs met the process-of-experimentation test is neither binding on the Government nor presumptively correct.

Trinity’s second contention is that, even if the conclusions in the revenue agent report are not binding or presumptively correct in this litigation, the district court erred in deciding that the Penn Tugs did not satisfy the process-of-experimentation test under the fourth QRE requirement. Under this requirement, substantially all (i.e., 80%) of the taxpayer’s research activities, “measured on a cost or other consistently applied reasonable basis,” must constitute elements of a “process of experimentation.” As the Treasury regulations elaborate, a process of experimentation involves three steps: (1) “the identification of uncertainty concerning the development or improvement of a business component,” (2) “the identification of one or more alternatives intended to eliminate that uncertainty,” and (3) “the identification and the conduct of a process of evaluating the alternatives (through, for example, modeling, simulation, or a systematic trial and error methodology).”

Case law applying the Treasury regulations explains that the process-of-experimentation test is not satisfied when the taxpayer uses “a method of simple trial and error to validate that a process or product change meets the taxpayer’s needs.” Instead, at a minimum, the research activities must involve a “systematic trial and error methodology” — “a methodical plan involving a series of trials to test a hypothesis, analyze the data, refine the hypothesis, and retest the hypothesis so that it constitutes experimentation in the scientific sense.”

The Penn Tugs were each designed to connect to a barge by an articulating arm (the “Intercon coupler”) so that the tugboat and the barge could function as a single ship with high horsepower. In this tug-coupler-barge combination, or articulated tug barge (ATB), “the tug functioned as the propeller for the barge, rather than as a traditional tug that pulls another vessel.” Trinity designed the Penn Tugs to avoid the Coast Guard’s requirement that large cargo ships have a 20-person crew with advanced licenses. Instead of treating the ATB as a cargo ship, the Coast Guard would certify the vessel as a tug and a barge, so it could be manned by a smaller crew with simpler licenses, resulting in cost savings for the operator.

In addressing the Penn Tugs, the district court first noted that the Intercon coupler part of the ATB was invented, developed, and built by other companies. The court then considered the first Penn Tug and noted that, in designing it, Trinity modified an existing hull design. The court acknowledged this modification involved assessing “the structural integrity of the tug-coupler combination” since “portions of the tug can be ‘hanging’ from the Intercon coupler, depending on wave action on the barge.” The court nonetheless concluded that less than 80% of the costs were incurred in a process of experimentation “because third parties designed and manufactured the coupler system itself and the first Penn Tug was only a modification of an existing design.”

The district court then separately addressed the second Penn Tug. Trinity made the elevated pilot house on the first Penn Tug out of aluminum to save weight, but due to changes in Coast Guard fire regulations, Trinity was required to make the pilot house on the second Penn Tug out of steel. Although the heavier steel pilot house required other design changes, the district court decided that “the only process of experimentation between the first and second Penn Tugs was the change to the composition of the pilot house.” The district court concluded that the second Penn Tug also failed the process-of-experimentation test.

This determination on the Penn Tugs is best characterized as a mixed question of law and fact, since the underlying facts and applicable legal standard are basically undisputed, and the question is whether those facts satisfy the legal standard. We therefore review any underlying fact findings for clear error, but we review the legal conclusion that the Penn Tugs failed the process-of-experimentation test de novo. In addition, Trinity had the burden of proof, including the burden of going forward with evidence and the burden of persuasion, that 80% of its Penn Tugs research activities constituted a process of experimentation.

Trinity has not established that 80% of the research activities for the Penn Tugs constituted elements of a process of experimentation. Trinity argues that modifying the existing hull design involved some systematic trial and error — “[f]inding a working hull design required testing and retesting, including computer modeling and at least four design revisions” — and has cited testimony to support this assertion. However, Trinity has failed to establish which other aspects of its research involved the sort of systematic trial and error required under the process-of-experimentation test. For instance, in arguing that changing the composition of the pilot house involved a process of experimentation, Trinity says it “experimented” with an aluminum tower and pilothouse but does not allege that designing the tower involved the sort of systematic trial and error that the hull design involved. While designing the hull involved some systematic experimentation, Trinity has not shown that substantially all of its Penn Tugs research activities involved the type of systematic experimentation courts have required.

Moreover, since the coupler was not designed by Trinity, the district court correctly focused on research activities with respect to the tugs only. Trinity does not dispute that it did not invent, design, or build the coupler; it only claims it “had to work alongside the company manufacturing the coupler to design the support structure required for it to work within this prototype.” While Trinity had to consider how the tugs would support the coupler, it did not design the coupler itself.

The district court also did not err in finding that, even if aspects of designing the hull involved a process of experimentation, the hull’s design was still based on an existing hull design to some extent. Although there was testimony that Trinity had “to start over from scratch with a different hull,” the testimony also established that designing the new hull involved “modifications” of old hulls, and that the ultimate hull design had similarities to existing hulls. Additionally, it was not error to consider the two Penn Tugs “separately and sequentially,” as Trinity argues. Trinity cites no authority requiring the court to treat the tugs together, and even if Trinity had, Trinity has not shown how the 80% threshold would be surpassed by doing so. Accordingly, the district court did not err in its analysis of the Penn Tugs.

* * *

For the foregoing reasons, the judgment of the district court is AFFIRMED in part, and VACATED and REMANDED in part for proceedings consistent with this opinion. 
      
      . I.R.C. § 41(a)(1) (codified at 26 U.S.C. § 41(a)(1)).
     
      
      . Id. § 41(c)(1).
     
      
      . Id. § 41(c)(3)(A).
     
      
      . Id. § 41(c)(6)(A) ("[T]he [QREs] taken into account in computing [the fixed base] percentage shall be determined on a basis consistent with the determination of [QREs] for the credit year.”).
     
      
      . See id. § 41(d)(1) (providing that to constitute QREs, (1) the expenses must be deductible under I.R.C. § 174; (2) the research must be for the purpose of discovering technological information; (3) the application of that information must be intended to be useful in the development of a new or improved business component; and (4) substantially all of the research must constitute elements of a process of experimentation).
     
      
      . Id. § 41(d)(1)(C); Treas. Reg. § 1.41-4(a)(6) (codified at 26 C.F.R. § 1.41-4(a)(6)).
     
      
      . See Treas. Reg. § 1.41-4(b)(2).
     
      
      . Coe v. Chesapeake Exploration, L.L.C., 695 F.3d 311, 316 (5th Cir.2012) (quoting Preston Exploration Co. v. GSF, L.L.C., 669 F.3d 518, 522 (5th Cir.2012)).
     
      
      . Petrohawk Props., L.P. v. Chesapeake La., L.P., 689 F.3d 380, 388 (5th Cir.2012) (quoting French v. Allstate Indem. Co., 637 F.3d 571, 577 (5th Cir.2011)).
     
      
      . Id. (citing Dickerson v. Lexington Ins. Co., 556 F.3d 290, 294 (5th Cir.2009)).
     
      
      . I.R.C. § 41(a)(1).
     
      
      . Id. § 41(a)(1), (c)(2).
     
      
      . Id. § 41(c)(1).
     
      
      . See id. § 41(c)(3)(A).
     
      
      . United States v. McFerrin, 570 F.3d 672, 676 (5th Cir.2009) (alteration in original) (quoting I.R.C. § 41(d)(1)).
     
      
      . Treas. Reg. § 1.41-4(a)(6).
     
      
      . I.R.C. § 41(d)(2).
     
      
      . Treas. Reg. § 1.41-4(b)(2).
     
      
      . Id.
      
     
      
      . Id.
      
     
      
      . I.R.C. § 41(c)(6)(A).
     
      
      . Treas. Reg. § 1.41-3(d)(1) ("[QREs] and gross receipts taken into account in computing a taxpayer’s fixed-base percentage and a taxpayer’s base amount must be determined on a basis consistent with the definition of [QREs] and gross receipts for the credit year, without regard to the law in effect for the taxable years taken into account in computing the fixed-base percentage or the base amount.”).
     
      
      . I.R.C. § 41(a)(1).
     
      
      . Id. § 41(c)(6)(A).
     
      
      . Treas. Reg. § 1.41-3(d)(1).
     
      
      . Id. § 1.41-3(d)(2) (Example 1).
     
      
      . Id.
      
     
      
      . Id. (Example 2); see also Union Carbide Corp. v. Comm’r, 97 T.C.M. (CCH) 1207, T.C. Memo. 2009-50, at *75 (2009) ("[T]he taxpayer must include the same types of activities from the credit year and the base period when identifying qualified research activities and include the same types of costs as QREs for the credit year and the base period.”), aff'd, 697 F.3d 104 (2d Cir.2012).
     
      
      . See I.R.C. § 41(d)(1); Treas. Reg. § 1.41-4(a)(6).
     
      
      . See Fed.R.Evid. 801(d)(2)(D).
     
      
      . See Clapp v. Comm’r, 875 F.2d 1396, 1403 (9th Cir.1989) ("In tax cases ... [the] United States District Court review[s] the Commissioner's decision on the merits de novo."); Int’l Paper Co. v. United States, 36 Fed.Cl. 313, 320 (1996) (recognizing the “de novo nature of tax refund proceedings in the Court of Federal Claims (as well as in the district courts)”).
     
      
      . Carson v. United States, 560 F.2d 693, 695-96 (5th Cir. 1977) ("The burden and the presumption, which are for the most part but the opposite sides of a single coin, combine to require the taxpayer always to prove by a preponderance of the evidence that the Commissioner’s determination was erroneous.”) (citing United States v. Janis, 428 U.S. 433, 440, 96 S.Ct. 3021, 49 L.Ed.2d 1046 (1976)); see also Cook v. United States, 46 Fed.Cl. 110, 113-14 (2000) (discussing the presumption of correctness and the taxpayer's burden in tax refund suits).
     
      
      . Int’l Paper Co., 36 Fed.Cl. at 320 (quoting Pierson v. United States, 428 F.Supp. 384, 390 (D.Del.1977)); see also Michael I. Saltzman & Leslie Book, IRS Practice and Procedure ¶ 1.05[2][a] (explaining that in a tax refund suit in federal court, "any ‘record’ made in the Service, including the reasons for its assessment, is irrelevant” and that the "action involves a de novo determination of the correct tax and is not a review of the administrative processing of the case”).
     
      
      . I.R.C. § 41(d)(1)(C); Treas. Reg. § 1.41-4(a)(6).
     
      
      . United States v. McFerrin, 570 F.3d 672, 677 (5th Cir.2009) (quoting Treas. Reg. § 1.41-4(a)(5)(i)).
     
      
      . Union Carbide Corp. v. Comm’r, 97 T.C.M. (CCH) 1207, T.C. Memo. 2009-50, at *81 (2009), aff'd, 697 F.3d 104 (2d Cir.2012); see United States v. Davenport, 897 F.Supp.2d 496, 506 (N.D.Tex.2012) (same).
     
      
      . Treas. Reg. § 1.41-4(a)(5)(i).
     
      
      . Union Carbide Corp., T.C. Memo. 2009-50, at *81; see Davenport, 897 F.Supp.2d at 506 (same).
     
      
      . See Ornelas v. United States, 517 U.S. 690, 696-97, 116 S.Ct. 1657, 134 L.Ed.2d 911 (1996) (explaining that for mixed questions of law and fact, "historical facts are admitted or established, the rule of law is undisputed, and the issue is whether the facts satisfy the [relevant statutory standard]”) (internal quotation marks and citation omitted).
     
      
      . See Dunn v. Comm’r, 301 F.3d 339, 348 (5th Cir.2002) (holding that the “determination of fair market value” of common stock under Treasury regulations is a mixed question of fact and law, and that the factual premises of mixed questions are reviewed for clear error and the legal conclusions are reviewed de novo).
     
      
      . Int’l Paper Co. v. United States, 36 Fed.Cl. 313, 322 (1996).
     
      
      
      . See Union Carbide Corp., T.C. Memo. 2009-50, at *7, *85-86 (holding that substantially all of the research activities in connection with an "anticoking project” (a project aimed at reducing the build-up of carbon in the taxpayer’s furnaces) constituted elements of a process of experimentation because the project involved collecting and analyzing data to compare existing technologies with alternatives and ultimately refining the process to improve it overall); see also Davenport, 897 F.Supp.2d at 514-15 (holding that the taxpayer’s project "did not involve a process of experimentation or the type of systematic plan involving a series of trials to test a hypothesis, analyze the data, refine the hypothesis, and retest the hypothesis; nor did it involve a series of experiments with one or more alternatives to develop the [project]”) (internal quotation marks and citation omitted).
     