
    UNITED STATES of America, Plaintiff-Appellee, v. Danilo GARCIA, a/k/a Donny, a/k/a Darreo, Defendant-Appellant.
    No. 13-4136.
    United States Court of Appeals, Fourth Circuit.
    Argued: March 20, 2014.
    Decided: May 15, 2014.
    
      ARGUED: Todd Michael Brooks, Whiteford Taylor & Preston LLP, Baltimore, Maryland, for Appellant. Ayn Brigoli Ducao, Office of the United States Attorney, Baltimore, Maryland, for Appellee. ON BRIEF: Erek L. Barron, Whiteford Taylor & Preston LLP, Bethesda, Maryland, for Appellant. Rod J. Rosenstein, United States Attorney, Office of the United States Attorney, Baltimore, Maryland, for Appellee.
    Before GREGORY and KEENAN, Circuit Judges, and DAVIS, Senior Circuit Judge.
   Vacated and remanded by published opinion. Senior Judge DAVIS wrote the opinion, in which Judge GREGORY and Judge KEENAN joined.

DAVIS, Senior Circuit Judge:

Appellant Danilo Garcia, one of fourteen defendants named in a superseding indictment returned by the grand jury in the District of Maryland, appeals his convictions on five counts of narcotics trafficking. After a severance, Garcia proceeded to trial with two codefendants.

The Government’s presentation at trial consisted primarily of two types of evidence: (1) law enforcement testimony recounting observations (and describing a few drug seizures) resulting from close surveillance of the physical comings and goings of numerous coconspirators and targets, including drug couriers; and (2) audio recordings of wiretapped mobile telephone conversations concerning drug supplies, deliveries, and payments therefor. Over defendants’ vigorous objections before and during trial, the district court permitted an agent of the Federal Bureau of Investigation to testify as an expert on coded drug-related conversations. The agent also testified as a fact witness regarding aspects of the lengthy investigation.

The jury convicted Garcia of drug conspiracy and of the four substantive counts in which he was named, but it was unable to reach a unanimous verdict as to his co-defendants and the court declared a mistrial on those charges.

On appeal, Garcia assigns error to the district court’s admission of the decoding expert’s testimony, and the court’s denial of his motion for judgment of acquittal for lack of sufficient evidence as to one of the substantive counts on which he was convicted.

Upon our careful review of the record, we hold that the district court abused its discretion in its evidentiary rulings as viewed in their totality. Specifically, we hold that, on this record, safeguards adopted by the district court to avoid the substantial risk of prejudice inhering in the jury’s receipt of the decoding expert’s testimony were inadequate. Garcia timely and repeatedly objected regarding the foundational sufficiency and methodological reliability of the agent’s expert testimony, and he specifically pointed to the risk of prejudice arising from the agent’s dual capacity as both an expert and fact witness. We are persuaded that, under the circumstances of this case, neither the district court’s cautionary instructions to the jury nor its sporadic sustaining of some of counsels’ objections adequately mitigated the risk of substantial prejudice. Furthermore, we are unable to conclude that the missteps evident in this record were harmless. We reach this conclusion reluctantly, because the district court tried mightily to hew to the lines we have drawn in prior cases.

Accordingly, although we discern no reversible error in the court’s denial of the motion for judgment of acquittal, we hold that the errors in the decoding expert’s testimony so infected the entire trial that we must vacate the judgment and remand the case to the district court for further proceedings.

I.

A.

On August 23, 2012, the grand jury returned a ten-count superseding indictment against Garcia and thirteen others. Garcia was named in the following five counts: Count One, conspiracy to distribute one kilogram or more of heroin, in violation of 21 U.S.C. §§ 841(a)(1), 846, from April 2009 to November 2011; Count Four, possession with intent to distribute heroin on April 19, 2011 in violation of 21 U.S.C. § 841(a)(1); Count Five, possession with intent to distribute 100 grams or more of heroin on July 22, 2011 in violation of 21 U.S.C. § 841(a)(1); Count Six, possession with intent to distribute 100 grams or more of heroin on August 5, 2011 in violation of 21 U.S.C. § 841(a)(1); and Count Seven, possession with intent to distribute 100 grams or more of heroin on April 17, 2009 in violation of 21 U.S.C. § 841(a)(1).

B.

At the times relevant to this case, Garcia lived in New York City. The basic theory of the prosecution was straightforward: (1) coconspirator Yoni Rodriguez, who pled guilty in the Eastern District of New York on narcotics charges and testified against Garcia pursuant to a plea agreement, was Garcia’s source of uncut heroin starting sometime in 2005, selling to Garcia every few weeks or so, usually on consignment; (2) Garcia would distribute the heroin in Baltimore, after either transporting the drugs himself or via drug couriers from New York; (3) Roy Lee Clay and Walter Lee Powell were Baltimore-based middlemen who distributed quantities of heroin to other dealers in the Baltimore area.

In addition to the charge of knowing participation in the thirty-month drug trafficking conspiracy, Garcia was charged with four specific instances of possession with intent to distribute heroin, the circumstances surrounding which we summarize as follows:

On April 17, 2009, a police officer observed Garcia exiting a white shuttle bus in the parking lot of a Baltimore travel plaza, carrying a brown paper bag. After Garcia saw the police officer, he reentered the bus, left the bag behind, and disclaimed any knowledge of or interest in the bag. The officer found a manicure set inside the bag, and inside the manicure set he discovered approximately 200 grams of heroin.
On April 19, 2011, police observed Powell enter Kenya Salik Montgomery’s car and exit after about two minutes. When, about two hours later, the police searched Montgomery, they found multiple baggies of heroin.
On July 22, 2011, Garcia spoke by phone with Powell and instructed Powell to travel to Philadelphia. The FBI observed Powell when he arrived in Philadelphia and visited a certain neighborhood. After remaining in Philadelphia a mere fifteen minutes, Powell returned to Maryland (followed by law enforcement). A Maryland State Police trooper conducted a traffic stop upon Powell’s entry into the state and the trooper recovered 143.7 grams of heroin from Powell.
On August 5, 2011, FBI agents observed Nancy Feliciano at a travel plaza in Baltimore, leaving a van and getting into a cab. As she was getting into the cab, the agents approached her and searched her bags (with her consent). They found approximately 500 grams of heroin and an index card which had Powell’s address and phone number. Later, while she was being detained in the custody of the FBI, Feliciano made a monitored call to Garcia and accused him of providing her with heroin.

During the trial, many members of law enforcement, including FBI agents, U.S. Drug Enforcement Agency agents, and state and local law enforcement officers, testified regarding their surveillance and observations of the defendants together in and around Baltimore, as well as in New York and once in Philadelphia. Critically, much of the investigation entailed the maintenance of wiretaps on four mobile phones used by Garcia, two used by Clay, and two used by Powell. To a significant extent, coded telephone conversations could be aligned with the physical activities of the coconspirators.

C.

On August 30, 2012, the Government provided notice to the defendants of its intent to call FBI Special Agent Carrie Dayton as an expert. The Government provided Agent Dayton’s resumé, and stated that she “served as a monitor on the wiretaps, and will testify as to the meaning of coded references in several of the calls used by the conspirators, when discussing drug trafficking over the phone.” J.A. 38. The Government’s disclosure showed that Agent Dayton had worked as a paralegal and then, for more than six years, as an attorney, before joining the FBI in May 2004. She thus had eight years of law enforcement experience by the time of trial. She initially worked in the FBI’s Civil Rights and White Collar Crime divisions, and joined the Safe Streets Task Force in November 2007, working narcotics and violent crime investigations.

Agent Dayton had “observed and [ ] recognize[d] patterns of behavior consistent with narcotics trafficking” as a result of her participation in over 20 investigations. J.A. 57. Agent Dayton was also involved in over five wiretap investigations, during which she “listened to thousands of phone conversations between individuals who participated in or were suspected of” drug trafficking conspiracies and other crimes. Id. Agent Dayton attested that she has “gained a working knowledge of drug language and terminology commonly used by drug dealers” and that she was “aware of the technique and [has] heard hundreds of conversations in which narcotics traffickers employ code words to disguise discussions” about drugs. Id. Agent Dayton had not previously testified as an expert on coded drug language.

Days before the commencement of trial, defendants moved to exclude the testimony of Agent Dayton on the basis that the Government failed to provide a sufficient expert disclosure in conformity with Federal Rule of Criminal Procedure 16(a)(1)(G). On October 2, 2012, after opening statements, the district court denied the motion. Referring to whether the Government had provided a sufficient outline of Agent Dayton’s proposed testimony, the court stated that,

it appears to me that when a disclosure says she is going to testify to coded references in transcripts of calls related to narcotics, clearly, based on the openings, counsel have understood the government’s theory to be that references to things like cars and perhaps girls are in fact references to drugs, that it does not take any great additional leap of inference from there to figure out what Special Agent Dayton is going to testify to.

J.A. 177-78. Accordingly, the district court held that there was no prejudice to the defense “even if there was something deficient in the disclosure.” J.A. 178.

D.

On eighteen separate occasions during six days over the two-week trial, the Government called and re-called Agent Dayton to the stand to testify on the coded language used in recorded wiretapped calls between the conspirators. From the start, defense counsel voiced vigorous and repeated objections to Agent Dayton’s qualifications as an expert witness in decoding drug traffickers’ language. Counsels’ objections continued as the presentation of her testimony unfolded, specifically based on Agent Dayton’s failure to conform to recognized methodologies for such testimony and the Government’s failure to explicitly demonstrate the existence of an acceptable foundation for many of her specific interpretations. The district court generally overruled the defense objections (while sustaining some), holding that Agent Dayton had “sufficient experience to meet the standard to be able to offer opinion testimony that might be helpful to the jury based on ... her five years of experience, dozens and dozens of shifts on wiretaps, and thousands and thousands of calls, as well as her conversations with both cooperators, witnesses, and other law enforcement officers.” J.A. 320.

Manifesting a deep familiarity with this Circuit’s settled guideposts with regard to this type of testimony, the experienced judge identified early on two potential problems with Agent Dayton’s testimony: (1) the need to distinguish between her lay fact testimony based on her personal knowledge, on the one hand, and her expert opinion testimony based on her training and investigatory experience, on the other; and (2) ensuring that she was testifying on the basis of her experience and expertise in coded language, and not simply repeating what cooperators or witnesses told her.

This second point was especially salient because, as Agent Dayton conceded during voir dire in the presence of the jury, she had debriefed several cooperating coconspirators in this very case, including at least one who speaks only Spanish: Garcia’s supplier, Yoni Rodriguez. Thus, it was particularly apparent at the start of trial (especially as Agent Dayton had never before testified as an expert) that special care was required to expose to the jury (and to the defense) the true bases for Agent Dayton’s “opinions.”

In any event, cognizant of these perils, the district court issued a cautionary instruction to the jury prior to Agent Dayton’s testimony. It informed the jury that a distinction would be made between the agent’s fact and expert opinion testimony, and that it would not be appropriate for Agent Dayton to testify to the meaning of a word simply because a witness told her its meaning.

In the course of the investigation in this case, investigators had monitored in whole or in part more than 20,000 calls; the Government considered 1,928 conversations pertinent to the investigation and it played approximately 211 of such monitored calls, many in Spanish, during the trial. Agent Dayton testified repeatedly over the course of six days, “interpreting” words in nearly half the calls played before the jury, including the English translations of the conversations occurring in Spanish. The defense made countless objections regarding the melding of her fact and expert opinion testimony, the agent’s failure to adhere to her professed methodology, and the absence of foundations for many of her specific interpretations. While the district court sustained some objections, for the most part, it admitted the bulk of Agent Dayton’s testimony, which was often elicited through a prosecutor’s insertion of a simple prologue to a question: “Agent Dayton, in your expert opinion ...” or “Agent Dayton, based on your expertise ...”

As mentioned, Agent Dayton’s testimony was interspersed with the testimony of other investigators. After a 12-day trial, the jury returned a verdict of guilty on all five counts in which Garcia was named (one count of conspiracy to distribute heroin, and four counts of possession with intent to distribute heroin), but a mistrial was declared as to the two codefendants. The district court entered judgment and sentenced Garcia to 188 months on all counts, to run concurrently.

Garcia timely appealed and we exercise jurisdiction pursuant to 28 U.S.C. § 1291.

II.

Garcia presents a number of arguments on appeal regarding the admissibility of Agent Dayton’s expert testimony. While we find that the district court did not abuse its discretion in qualifying Agent Dayton as an expert, we are persuaded that the district court failed to adequately safeguard against a strong likelihood of jury confusion between Agent Dayton’s testimony as an expert witness and as a fact witness, and failed to ensure that, in her capacity as an expert witness, Agent Dayton “reliably applied” her methodology and adequately set forth the underlying permissible foundations for her opinions.

A.

Garcia contends that the district court abused its discretion in qualifying Agent Dayton as a decoding expert in the first place, and that therefore the district court erred in admitting her testimony, without regard to the special dangers inherent in the use of decoding experts. We disagree.

We review a district court’s decision to qualify an expert witness, as well as the admission of such testimony, for abuse of discretion. United States v. Wilson, 484 F.3d 267, 273 (4th Cir.2007). “A court abuses its discretion if its decision is ‘guided by erroneous legal principles’ or ‘rests upon a clearly erroneous factual finding.’ ” United States v. McLean, 715 F.3d 129, 142 (4th Cir.2013) (quoting United States v. Johnson, 617 F.3d 286, 292 (4th Cir.2010)).

The Federal Rules of Evidence provide that a

witness who is qualified as an expert by knowledge, skill, experience, training, or education may testify in the form of an opinion or otherwise if:
(a) the expert’s scientific, technical, or other specialized knowledge will help the trier of fact to understand the evidence or to determine a fact in issue;
(b) the testimony is based on sufficient facts or data;
(c) the testimony is the product of reliable principles and methods; and
(d) the expert has reliably applied the principles and methods to the facts of the case.

Fed.R.Evid. 702. In the advisory committee notes following the 2000 amendments to the Rule, there is a pointed discussion on the use of hybrid witnesses at issue here:

For example, when a law enforcement agent testifies regarding the use of code words in a drug transaction, the principle used by the agent is that participants in such transactions regularly use code words to conceal the nature of their activities. The method used by the agent is the application of extensive experience to analyze the meaning of the conversations. So long as the principles and methods are reliable and applied reliably to the facts of the case, this type of testimony should be admitted.

Fed.R.Evid. 702 advisory committee’s note (emphasis added).

Garcia’s argument that Agent Dayton’s experience was insufficient to qualify her as a decoding expert, given the emphasis on experiential knowledge for these types of experts, misses the mark. We think Fourth Circuit law in this area is reasonably clear. Wilson, 484 F.3d at 267; see also United States v. Galloway, 749 F.3d 238, 243-46, No. 12-4545, 2014 WL 1424939, at *4-6 (4th Cir. April 15, 2014) (applying plain error review).

In Wilson, we reviewed the district court’s admission of a police investigator’s expert and fact witness testimony; the investigator’s expertise was in “decipher[ing] the intercepted communications that were recorded via wiretap” as they “were full of drug slang and street code[.]” Id. at 273. We found no abuse of discretion in the district court’s admission of the expert evidence. There, the expert’s qualifications included nine years spent mostly investigating drug traffickers and attendance in training programs offered by the DEA, training which specifically included exposure to drug traffickers’ use of coded language. Id. at 275-76.

While Agent Dayton certainly has less experience than the witness in Wilson, and had never before this case been qualified as an expert, the precise number of years of an investigator’s experience, or the number of investigations on which she has worked, is not necessarily dispositive. Every expert has a first time. It is the quality of Agent Dayton’s experience, especially her exclusive focus on narcotics trafficking for the preceding five years, on which the district court properly focused in finding her qualified as a decoding expert.

Agent Dayton’s job required her “to work in close proximity with drug users on a daily basis,” United States v. Baptiste, 596 F.3d 214, 218 (4th Cir.2010); she had “monitored dozens of wiretaps, listening to thousands of intercepted phone calls in the process,” United States v. Mack, 495 Fed.Appx. 359, 364 (4th Cir.2012) (internal quotations omitted); and in the “course of [her] extensive experience dealing with drug traffickers and confidential informants, [ ] was able to learn to understand the vernacular used by drug traffickers.” Wilson, 484 F.3d at 275-76.

Garcia also takes issue with Agent Dayton’s explanation of her methodology. The defendants in Wilson similarly challenged the investigator’s methodology, claiming that it was “neither sufficiently explained nor reliable.” 484 F.3d at 273. Agent Dayton explained her methodology as follows: “I think we see some common [terms], receipts or paper or clothing terminology. So[me] of it’s common and some is just in the context of the conversation. As you listen to the conversation, the language used doesn’t make sense, and sometimes they switch up code so the context is all wrong.” J.A. 285. This is very similar to the witness’s explanation in Wilson, 484 F.3d at 275 (explaining that the expert relied on “the context of the call” to see the “pattern that develops.”). While this explanation is not as coherent as we might wish, given the deferential standard of review applicable here, it passes muster as an articulation of an accepted methodology. Agent Dayton also based her opinions in part on conversations she had with witnesses in other cases who “would talk about the drug shops and the language that was used to communicate with one another.” J.A. 294.

Given the similarity between Agent Dayton’s explanation of her methodology and the methodology we found acceptable in Wilson, we cannot say that the district court abused its discretion in qualifying Agent Dayton as an expert. As we have recognized, “[a] ‘trial judge must have considerable leeway in deciding in a particular case how to go about determining whether particular expert testimony is reliable.’ ” Wilson, 484 F.3d at 273 (quoting Kumho Tire Co. v. Carmichael, 526 U.S. 137, 152, 119 S.Ct. 1167, 143 L.Ed.2d 238 (1999)). Agent Dayton met the requirements for testifying to the meaning of coded language recorded from the coconspirators’ calls, and adequately explained her methodology. Accordingly, no abuse of discretion infects the district court’s determination to accept Agent Dayton as a decoding expert.

B.

Despite the district court’s careful attention to Agent Dayton’s credentials as a decoding expert, however, we hold that the agent’s testimony was fraught with error arising from the problems the district court itself identified early in the trial: the conflation of Agent Dayton’s expert and fact testimony, particularly her reliance on her knowledge of the investigation to support her coding interpretations; her failure to apply her methodology reliably; and last, her failure to state on the record an adequate foundation for very many of her specific interpretations. Moreover, because Agent Dayton’s testimony was so extensive and most likely highly influential in the jury’s evaluation of the Government’s case against Garcia, we are constrained to hold that these flaws deprived Garcia of a fair trial, i.e., that the missteps were not harmless, and thus require vacatur of Garcia’s convictions.

1.

Garcia contends that there were inadequate safeguards to protect the jury from conflating Dayton’s testimony as an expert and fact witness. We agree.

We have recognized that individuals who testify as expert and fact witnesses can cause jury confusion, and such a manner of proceeding is only “acceptable where ‘the district court took adequate steps ... to make certain that [the witness’s] dual role did not prejudice or confuse the jury.’ ” Baptiste, 596 F.3d at 224 (quoting Wilson, 484 F.3d at 278 n. 5) (alterations in original). Such safeguards might include requiring the witness to testify at different times, in each capacity; giving a cautionary instruction to the jury regarding the basis of the testimony; allowing for cross-examination by defense counsel; establishing a proper foundation for the expertise; or having counsel ground the question in either fact or expertise while asking the question. Id. at 224 (citing Wilson, 484 F.3d at 278 n. 5 and United States v. Farmer, 543 F.3d 363, 370-71 (7th Cir.2008)).

While a cautionary instruction might be sufficient in many instances to protect against jury confusion and resulting prejudice, see Galloway, 749 F.3d at 244-46, 2014 WL 1424939 at *5-6, we are persuaded that the instruction given here was insufficient to mitigate the potential for prejudice.

The district court represented to the jury that Government counsel would “be clear in their questions” whether they were asking Agent Dayton to testify based on the facts versus her expertise. J.A. 327. Despite this direction, the Government failed to do so: there were repeated instances of Agent Dayton moving back and forth between expert and fact testimony, with no distinction in the Government’s questioning or in Agent Dayton’s answers. And this was true despite the fact that the Government recalled Agent Dayton to the stand eighteen times. In light of the court’s earlier assertion that counsel would clearly distinguish the two types of testimony, the jury reasonably might have assumed that all of Agent Dayton’s testimony in response to questions asking for her expert opinion was indeed based on her decoding expertise.

Our review of the record reveals multiple occasions in which the Government prompted Agent Dayton to assert information garnered from her participation in the investigation, having nothing to do with her ostensible decoding expertise. For instance, in response to a question about any coded language on the call “based on [her] expertise,” Dayton answered that “[w]hen Mr. Powell uses the term show time, he is letting Mr. Coley know he has heroin.” J.A. 371-72. Immediately after this exchange, without any further explanation of the term “show time” or warning that they were shifting away from Agent Dayton’s expertise to her factual knowledge, the prosecutor then asked Agent Dayton: “Now how were Ms. Montgomery and Mr. Coley identified as the participants in the calls that we’ve just seen or heard?” J.A. 372. Agent Dayton’s response had nothing to do with her expertise, and everything to do with her factual knowledge as an investigator in this case.

It is apparent even from the Government’s briefing that Agent Dayton used her personal knowledge of the investigation to form (not simply to “confirm”) her “expert” interpretations: “Special Agent Dayton also looked to the actual seizures of heroin in this case to form the basis of her expert opinion.” Appellee Br. 46 n. 8.

Notably, the Government points to a call in which Agent Dayton interpreted “a hundred forty five point,” to mean “145 grams of heroin,” J.A. 885, which is confirmed with evidence that police seized 145 grams of heroin from one of the coconspirators on the call. But unlike the display of expertise we endorsed in United States v. Johnson, this is not Agent Dayton “applying [her] expertise, derived over many years and from multiple sources, to interpret the transcripts of phone conversations.” 587 F.3d 625, 636 (4th Cir.2009). Instead, this exemplifies occasions, and we discern many of them, in which Agent Dayton simply substituted information gleaned from her participation in the investigation (including post-indictment debriefings of participants in the conspiracy) for ostensible expertise. For this, the Government need not have called a “decoding” expert at all: simply pointing to the seizure of 145 grams of heroin, and then the repeated mention of “145” in this call clearly would have been enough for any juror to make the connection. But cloaking this connection in the guise of expert testimony goes beyond what is contemplated under Federal Rule of Evidence 702, which requires an expert to “reliably appl[y] the principles and methods” for which she was qualified as an expert. The Rule contemplates that an expert’s opinion testimony will be “helpful to the jury,” not merely helpful to the prosecutor as transmutations of simple fact testimony. Cf. Wilson, 484 F.3d at 276-78 and nn. 4 and 5 (emphasizing need for the district court to enforce methodological, foundational, and helpfulness standards in admitting testimony of decoding expert, and affirming convictions under plain error review because, despite district court’s erroneous admission of some testimony, expert’s “methodology was reliably applied in the vast majority of instances ”) (emphasis added).

This issue actually surfaced early in the trial. During the Government’s voir dire of Agent Dayton in its effort to satisfy the district court that the agent had genuine “expertise,” the following colloquy occurred:

GOVERNMENT: Special Agent Dayton, regarding the wiretaps in this case, you have spoken to at least three members of the organization?
DEFENSE: Objection, Your Honor, leading. THE COURT: Overruled.
GOVERNMENT: Is that correct?
AGENT DAYTON: At least three, maybe more.
GOVERNMENT: Maybe more. To confirm your understanding of the coded language used in this case?
AGENT DAYTON: Yes.

J.A. 296. Later, the defense had Agent Dayton identify the three coconspirators whom she had debriefed, and she stated the following: “Diego Amparo, Yoni Rodriguez, Nancy Feliciano. I’m trying to think of the other folks who have come in. I’m sorry. Those are the three I believe we discussed code with.” J.A. 298. In light of this significant ingredient in Agent Dayton’s expertise, it was incumbent upon the Government to demonstrate that Agent Dayton was not merely channeling information and statements by non-testifying participants in the conspiracy into the trial record.

In Johnson, we expressed wariness over this exact problem, and cautioned against “[a]llowing a witness simply to parrot out-of-court testimonial statements of cooperating witnesses and confidential informants directly to the jury in the guise of expert opinion” as it “would provide an end run around Crawford.” 587 F.3d at 635 (internal quotation marks omitted). In Crawford v. Washington, the Supreme Court held that the Confrontation Clause permits the introduction of “[t]estimonial statements of witnesses absent from trial ... only where the declarant is unavailable, and only where the defendant has had a prior opportunity to cross-examine.” 541 U.S. 36, 59, 124 S.Ct. 1354, 158 L.Ed.2d 177 (2004). Of the three co-conspirators Agent Dayton specifically identified as contributing to her “understanding of the coded language used in this case,” though by her own admission there may have been more than three, only one testified at trial. We explained that in the face of such risks, “[t]he question is whether the expert is, in essence, giving an independent judgment or merely acting as a transmitter for testimonial hearsay.” Johnson, 587 F.3d at 635.

Here, we cannot say that Agent Dayton was giving such independent judgments. While it is true that she never made “direct reference to the content of [her] interviews,” id., this could just as well have been the result of the Government’s failure to elicit a proper foundation for Agent Dayton’s interpretations. See J.A. 367-68; supra 388 n. 4 (discussing the prosecutor’s statement in response to an objection for lack of foundation, that she was “trying to stay away from the objections raised by counsel regarding her, Special Agent Dayton testifying about what she learned from persons in this case, but it does form the basis, part of the basis of her opinion. I can certainly elicit that, but then I would think there’s going to be an objection to her testifying about that being the basis of her opinion.”).

Although Agent Dayton asserted that her after-the-fact debriefing of coconspirators who had commenced cooperating with the prosecution served to “confirm[ ]” her independent assessments of the meanings to be attributed to many words, it was imperative that the trial record demonstrate this sequence with a measure of clarity. The record is devoid of evidence that this was, in fact, the sequence of Dayton’s analysis, to Garcia’s prejudice.

2.

We find a second, equally fundamental flaw in the presentation of Agent Dayton’s testimony in addition to the above infirmity: the lack of foundations laid for each interpretation testified to, so much so that we are compelled to conclude that the record fails to demonstrate the requisite reliability in Agent Dayton’s execution of her claimed methodology. Call Number 214, referenced above regarding Agent Dayton’s interpretation of the term “show time” to mean “heroin,” is an illustrative example of the utter absence of any foundation for more than simply a few of her coding interpretations. No further explanation regarding the term “show time” was given — Agent Dayton did not explain that she had seen that term used in this investigation, let alone in her previous experience. There was no explanation as to what, in the context of the call or otherwise, led her to believe that “show time” meant “heroin.”

One significant indication of Agent Dayton’s failure to adequately explain her methodology is her lack of explanation as to how her methodology was affected by the fact that many of the telephone conversations were in Spanish. As Agent Dayton explained, the context of a conversation is an important factor when decoding suspected drug language. Although the jury heard English language transcripts of the actual call recordings read aloud, Agent Dayton provided no insight as to what steps, if any, she took to ensure that the context of the conversation was not lost or meaningfully altered in the process of translating the calls from Spanish to English.

The problem with this lack of foundation and reliably applied methodology becomes even more apparent when examining the instances when Agent Dayton’s interpretations are inconsistent with each other. For example, she testified that the use of “2” by the conspirators means either $200, J.A. 600, or $2,000, J.A. 533. In fact, in the explanation for interpreting “2” to mean “$2,000,” Agent Dayton testified that drug traffickers “drop zeros [sic] to make the numbers sound smaller.” Id. This would be an explanation for both of these interpretations, and there is no indication in the record why Agent Dayton’s expert methodology reasonably leads her to conclude that the same term means $200 in one instance, but $2,000 in another.

A like inconsistency is seen with regard to the number “5”: Agent Dayton first interprets this as “$5,000,” J.A. 854, and in another instance as 500 grams of heroin, J.A. 925. While there might very well be an explanation for this, based on Agent Dayton’s expertise, she did not explain it at trial to the jury, and it is not evident from the record what that explanation would be.

In another instance, Agent Dayton testified that, in her expert opinion, “590” was code for 590 grams of heroin. J.A. 774. She gave no explanation for the basis of this opinion, and importantly, her testimony that the number 590 actually meant 590 is inconsistent with the only methodology she offered for how she decoded the numbers heard in the calls: that the dealers spoke in code about numbers by dropping the zeroes.

In several instances, Agent Dayton “decoded” words and phrases that needed no expert translation at all since the meaning was either apparent on its face or apparent with contextual information that any fact witness could have provided. See, e.g., J.A. 406-09 (testimony that the phrases “first one” and “second one” were code for different deliveries of heroin); J.A. 410, 412 (testimony that the phrase “over there” was code for Baltimore); J.A. 854, 857, 875, 1190 (testimony that the word “stuff” was code for heroin); J.A. 940 (testimony that the word “number” was code for price in the phrase “the number they gave me was too high. We’re, we’re on the number he told me.”); J.A. 958 (testimony that the word “number” was code for price in the phrase “The important thing is the quality and a good number.”). This purported “decoding” of language that did not actually need decoding casts further doubt on whether Agent Dayton was reliably applying her methodology. Unlike in Wilson, where we found that the expert “applied his methods and principles reliably in the vast majority of his testimony,” 484 F.3d at 277, we cannot say the same is true of the record before us here. The record is replete with instances of Agent Dayton providing no explanation for her interpretation, other than a token reference to her expertise in the Government’s framing of questions. While the district court was appropriately careful in its initial examination of Agent Dayton’s qualifications to testify as an expert, it failed to maintain its “gatekeeper” role throughout that testimony, Gen. Elec. Co. v. Joiner, 522 U.S. 136, 142, 118 S.Ct. 512, 139 L.Ed.2d 508 (1997), and the Government did little, if anything, to protect the generous ruling it had obtained from the district court from morphing into error.

III.

The Government contends that even if it was error for the district court to admit Agent Dayton’s testimony as it did, the error was harmless because there was enough evidence outside of Agent Dayton’s testimony to justify the jury’s conviction of Garcia on all five counts. We disagree with the Government’s harmlessness metric.

When nonconstitutional error has been established in a criminal case, “the Government must demonstrate that the error did not have a ‘substantial and injurious effect or influence in determining the jury’s verdict.’ ” United States v. Curbelo, 343 F.3d 273, 278 (4th Cir.2003) (quoting Kotteakos v. United States, 328 U.S. 750, 776, 66 S.Ct. 1239, 90 L.Ed. 1557 (1946)). The Government points to other evidence presented at trial, relying on the principle that “where there is a significant amount of evidence which inculpates a defendant independent of the erroneous testimony, the error is considered harmless.” Johnson, 617 F.3d at 295 (citing United States v. Banks, 482 F.3d 733, 741-42 (4th Cir.2007)). The Government’s reliance on the dictum from Johnson, in which we refused to find the error harmless, is seriously misplaced.

Indeed, just as in Johnson, the Government’s case here is not ironclad. The Government first points to the testimony of Rodriguez, who testified that he regularly sold heroin to Garcia, interpreted certain terms, and no doubt was an impressive witness. Rodriguez’s credibility was put into question, however, not least because his testimony was in return for sentencing considerations by the Government in a New York prosecution in which he faced a maximum potential sentence of life in prison and, in addition or alternatively, deportation to the Dominican Republic, perhaps leaving behind his American citizen spouse. Of course, the jury was unquestionably entitled to credit the testimony of Rodriguez, every bit as much as the Government was entitled to call him as its witness. Nevertheless, Rodriguez’s testimony required searching scrutiny, for, as the prosecutor explained to the jury in its closing argument, “when you’re trying the devil, sometimes you’ve got to go to hell to get your witnesses.” J.A. 1735.

The Government also points to police surveillance of the coconspirators, in Baltimore and elsewhere, but save one instance when Garcia was seized while directly involved with drugs in April 2009, these observations hold no determinative weight absent the addition of the illuminating wiretap calls and the coded language referring to drugs. Each of the other counts for possession, and Garcia’s involvement in the conspiracy, rests on the connection between various individuals that was only established through the hundreds of calls played at trial. And while the contents and context of many of these calls indisputably point to illegal activity, we are unable to hold that the jury was unaffected by Agent Dayton’s unadorned interpretations.

These fundamental flaws are exemplified in the very piece of evidence the Government points to against Garcia in its harmlessness argument: Feliciano’s call to Garcia after drugs were seized from her bags at a travel plaza. In testifying about that call, between Feliciano and Garcia, Agent Dayton opined that the term “stuff” was “code for heroin.” J.A. 1190. As with so much of Agent Dayton’s testimony, no foundation was laid, and there was no explanation for the coding interpretation, leaving us to speculate that perhaps the basis of her opinion was the fact that the agents had just seized 500 grams of heroin from Feliciano. Without the remotest effort by the Government to justify such testimony under Rule 702, Agent Dayton’s “expert opinion” was little more than an expert veneer glossed on an item of evidence, e.g., the meaning of the term “stuff,” that the jury was eminently able, if it so chose, to find the meaning of for itself. But the jury was deprived of an opportunity to put to use its commonsense, mature judgment by the Government’s gilding of the expert witness lily.

Even if there was more uncompromised evidence, as Garcia points out, this Court’s inquiry is not “ ‘merely whether there was enough [evidence] to support the result, apart from the phase affected by the error. It is rather, even so, whether the error itself had substantial influence.’ ” Curbelo, 343 F.3d at 286 (quoting Kotteakos, 328 U.S. at 765, 66 S.Ct. 1239). Here, in a trial spanning twelve days, Agent Dayton testified on six different days, recalled to the stand eighteen times. From the beginning of the trial to the end of the trial, the calls and the meaning of the words used in those calls were the centerpiece of the Government’s case. There was little direct evidence connecting Garcia to three of the four actual possession charges, and law enforcement never observed Garcia actually exchanging drugs or money with any coconspirators. We cannot find Agent Dayton’s testimony harmless under the circumstances.

IV.

In Wilson, we criticized defense counsel for failing to react aggressively to nudge the district court to better exercise its “gatekeeping” responsibilities in respect to a law enforcement witness decoding expert. Wilson, 484 F.3d at 278 n. 5 (“Appellants deserve some of the blame for those rare instances where improper testimony slipped through the gate’s cracks.”). Here, no such scolding is appropriate. Garcia timely and repeatedly objected regarding the foundational sufficiency and methodological reliability of the agent’s expert testimony, and he specifically pointed to the risk of prejudice arising from the agent’s dual capacity as both an expert and fact witness. Counsel objected early and often, always respectfully and, on occasion, with success. But even when the district court sustained some objections (after a bench conference), the Government would often “move on” to its next question or its next area of interest, leaving prejudicial effects hanging in the air. Although here, as usual, the quantum of prejudice is not susceptible of exact measurement, we are persuaded that Garcia has established his entitlement to relief.

For the reasons set forth above, the judgment is vacated and the case is remanded for further proceedings not inconsistent with this opinion.

VACATED AND REMANDED. 
      
      . Garcia’s convictions on this and the two subsequent incidents are based on an aiding and abetting theory, tracing the drugs seized back to the heroin Garcia had obtained from Rodriguez and then distributed to dealers in Baltimore.
     
      
      . It appears, and we presume, that there was an understanding at trial that an objection made by any one of the three defense counsel was joined by the others.
     
      
      . The district court told the jury:
      The fact that I am determining that she has sufficient qualifications to offer you opinion testimony, again, I'm not deciding what weight you give her testimony or how you evaluate it. It’s up to you, as I said at the beginning, to pay attention to her experience, and her background, and training, and the reasons that she gives you for any opinion that she may offer, and then you decide what weight to give it.
      The other thing I want to say is that she may also be offering more fact testimony, just an ordinary lay witness or fact testimony about certain matters, and I have asked counsel to be clear in their questions which is which.
      To the extent she is offering you just fact testimony, something she actually saw or heard, that should be clear. To the extent it’s her opinion, based on her experience and review of these calls, we'll make that clear as well, and she will be asked to explain what the reason for her opinion is.
      But I want to be clear, any opinion that she may be offering you about these calls is not to be based on what somebody, who may or may not have been involved in this case, told her. Mr. Jones said that means cocaine, for example.
      That's not the basis of her opinion. That's not appropriate. She is not here to just repeat something that somebody may or may not have said to her when we don't have that person here for you to listen to, and make your own opinion about credibility.
      So her opinion, and we will be careful to limit it to that, is to the extent she has got a basis, on her experience and training and everything she told you about yesterday, to have a reason to believe that a certain reference means something that is not apparent on its face, she will explain that, and it is up to you to evaluate it again. But it is not because it is based on anybody that she may or may not have talked to in this case about the meaning of certain words.
      J.A. 326-28.
     
      
      . One notable objection early in Agent Dayton’s testimony came after she testified as follows after listening to a wiretap recording:
      AGENT DAYTON: In the second line of Mr. Coley speaking, wherein he says I owe you 200, he is actually talking about $2,000. In that same sentence, when he says I have a hundred and 50 for you, he is actually referring to $1500.
      GOVERNMENT: Are these amounts in reference to, based on your expert opinion, in reference to, these dollar amounts, in reference to anything in particular?
      AGENT DAYTON: These are in reference to — Mr. Powell has supplied Mr. Coley with heroin, and Mr.—
      J.A. 364. Defense counsel objected at this point, arguing that Agent Dayton had gone "beyond the spectrum of what she is capable of doing" by "opin[ing] as to ... the general nature of this particular conversation, and what the conversation means[.]” J.A. 365. Counsel further argued that "there has been no foundation or no explanation as to how she came to some of these conclusions. She is ... just saying this is what this is about, and we are listening to these calls isolated, without any proper context.” Id.
      
      The district court astutely summed up the issue:
      The problem is that [Agent Dayton] just said, in response to your question about how she knows that the 200 is a reference to 2,000, she just said because I know that Mr. Coley supplied heroin to Mr. Powell. That is not based on her expertise or the coded language. She is using fact information that she has gained otherwise, which is not appropriate.
      J.A. 366-67. The prosecutor responded by stating that she was
      trying to stay away from the objections raised by counsel regarding her, Special Agent Dayton testifying about what she learned from persons in this case, but it does form the basis, part of the basis of her opinion. I can certainly elicit that, but then I would think there’s going to be an objection to her testifying about that being the basis of her opinion.
      J.A. 367-68.
      The district court ultimately sustained the objection (as it did some others along the same vein) after the prosecutor further stated that "[r]egarding the numbers that Mr. Coley uses, she is in part relying on the investigation which showed that Mr. Coley and Mr. Powell met,” to which the district court responded, "Then it’s hearsay and it’s not an expert opinion.” J.A. 369.
      The fact that the prosecutor hesitated to lay an adequate foundation for Agent Dayton's opinions for fear of objection from defense counsel speaks volumes to the impermissible basis for that foundation. See also J.A. 375 (district court sustained objection because Dayton’s testimony that "cousin” meant a specific individual was on the basis of her personal knowledge, not on expertise); J.A. 452 (district court overruled objection based on lack of foundation despite term only heard in this investigation); J.A. 461 (district court overruled objection and advised Agent Dayton to "rely on what [she] learned prior to this investigation”); J.A. 573 (district court overruled objection where foundation for interpretation was that "Mr. Garcia uses the term the [sic] G. We see him use that in other calls on this wire to refer to a thousand dollars.”).
     
      
      . As we discuss infra n. 10, this formulaic approach hardly cured the problems that the district court anticipated and that we find actually to have arisen in the course of trial.
     
      
      . Garcia also challenges the sufficiency of the evidence on Count 4 (the April 19, 2011 Powell/Montgomery transaction), for possession of heroin with intent to distribute. Garcia’s conviction on that count, as on two of the other three of the substantive counts, was based on an aiding and abetting theory. We find the evidence was sufficient to prove beyond a reasonable doubt that Garcia "knowingly associated himself with and participated in the criminal venture.” United States v. Winstead, 708 F.2d 925, 927 (4th Cir.1983) (citing Nye & Nissen v. United States, 336 U.S. 613, 619, 69 S.Ct. 766, 93 L.Ed. 919 (1949) (further citations omitted)); see also Rosemond v. United States, ___ U.S. ___, 134 S.Ct. 1240, 1245-51, 188 L.Ed.2d 248 (2014) (explaining aiding and abetting principles). "[P]articipation in every stage of an illegal venture is not required, only participation at some stage accompanied by knowledge of the result and intent to bring about that result.” United States v. Burgos, 94 F.3d 849, 873 (4th Cir.1996) (en banc) (internal citations omitted). Rodriguez testified that he sold heroin to Garcia, which Garcia then sold to Powell for further distribution to customers like Montgomery. The relationship between Garcia and Powell was established by the wiretap calls, as well as surveillance of Garcia and Powell together. Furthermore, on the very day Montgomery was apprehended in possession of heroin, a conversation between Garcia and Powell was recorded on Montgomery’s voicemail, in which Garcia and Powell discussed their drug trafficking plans as interpreted by Agent Dayton.
      We note that even though we find error in the manner in which Agent Dayton’s testimony was presented to the jury, we must still consider that testimony in reviewing the denial of a Rule 29 motion for judgment of acquittal. United States v. Diaz, 300 F.3d 66, 77 (1st Cir.2002); see also United States v. Horton, 693 F.3d 463, 482 (4th Cir.2012) (Davis, J., concurring) (discussing the necessity of ruling on appellate claims of evidentiary insufficiency even in the context of trial error requiring reversal).
      Of course, we intimate no view as to whether upon a retrial the Government will adduce sufficient evidence to support conviction as to any count of the indictment.
     
      
      . Garcia also argues that the government's pretrial disclosures regarding Agent Dayton were insufficient. Federal Rule of Criminal Procedure 16(a)(1)(G) requires that "[a]t the defendant’s request, the government must give to the defendant a written summary of any testimony that the government intends to use under Rules 702, 703, or 705 of the Federal Rules of Evidence during its case-in-chief at trial.... The summary provided under this subparagraph must describe the witness's opinions, the bases and reasons for those opinions, and the witness's qualifications.” Garcia maintains that the information contained in the Government’s September 24, 2012 letter, and Dayton’s attached resumé, was insufficient to meet this requirement.
      In light of our ultimate ruling on the propriety of Agent Dayton’s testimony, we need not address the adequacy of the Government's compliance with Rule 16(a)(1)(G).
     
      
      . In Galloway, the district court repeatedly instructed counsel to be clearer in the demarcation between fact and lay opinion testimony. See, e.g., Joint Appendix at 472-75, 572-73, 749 F.3d 238, 2014 WL 1424939, ECF No. 59. There, the district court generally did not allow the expert to testify about the meaning of individual words in particular conversations, but rather only permitted the expert to testify whether, in his expert opinion, coded language was used in the conversation at issue as a general matter. There was little opportunity, then, for the expert's participation in the underlying investigation to encroach upon, enhance, or improperly bolster the expert testimony of that witness.
     
      
      . But even if she did not, the record is bereft of evidence that she did not, due largely to the Government’s deliberate failure to lay an adequate foundation for many of her opinions. For our purposes, this absence of clarity as to the source of her opinions amounts to the same thing as if she had no expertise at all.
     
      
      . In what is truly a curious argument, the Government contends that because Agent Dayton "did not offer any expert opinion regarding any calls by Yoni Rodriguez, and discussed only a handful of calls involving Diego Amparo and Nancy Feliciano,” Appellee Br. 46, there should be no concern about Agent Dayton’s ostensible methodologically sound expertise. Exactly the opposite is true. There is no discernible correlation favorable to the Government between Agent Dayton's debriefing of coconspirators whose calls she interpreted and those whose calls she did not interpret. The point is that it was one conspiracy and there is no reason to believe that Rodriguez, in particular, would have been unable to provide substantial information concerning the operational details of the conspiracy. And in fact, he did so in his testimony, including "interpretations” of many of the same words Agent Dayton interpreted. In short, the funneling of fact information learned from those inside the conspiracy and the offer of expert testimony was melded in a highly prejudicial manner in this case.
      Relatedly, the Government seems earnestly to contend that simply by including in its questions to Agent Dayton that the agent answer only based on her "expert opinion” somehow insulates the agent’s testimony from ordinary scrutiny under the settled principles of Daubert v. Merrell Dow Pharm., Inc., 509 U.S. 579, 113 S.Ct. 2786, 125 L.Ed.2d 469 (1993), and its progeny and Rule 702. The Government is wrong. It is bootstrapping of the worst kind to suggest to a jury that it should believe that everything a witness says is based on expertise gained from independent knowledge and experience in the absence of a record demonstrating as much. We would never permit counsel calling a physician or an engineer or an expert tightrope artist to proceed in such a manner; there is scant reason we should allow a prosecutor calling law enforcement officers as experts to do so.
     
      
      . And even where Agent Dayton stated (as she did from time to time) that a term had been seen in other calls during the investigation, this assertion alone is no explanation for the threshold interpretation. Indeed, the fact that the expert has heard a term in other calls during the investigation does not, by itself, explain how the expert knows what the term means. As discussed in text regarding the "interpretation” of numerals, the same term means different things at different times, even in the same conspiracy. The Government failed to show how Agent Dayton reliably leapt these lexicographical crevasses.
     
      
      . In Johnson, the prosecution persuaded the district court to admit the decoding expert’s opinion testimony as lay opinion evidence under Rule 701. We found error in that ruling, 617 F.3d at 292-93, and we declined to accept the prosecution’s fallback contention that the testimony could have been admitted as expert opinion evidence under Rule 702, id. at 294-95, and we declined to find that the error was harmless. The case at bar is, like Johnson, but another example of the vagaries in theoretical, methodological, and foundational challenges such testimony engenders and why, when not properly managed, there is an increased likelihood of error, perhaps harmless in some cases, perhaps not in others. Cf. Galloway, 749 F.3d at 243-46, 2014 WL 1424939 at *4-6 (affirming convictions after admission of decoding expert testimony upon plain error review); United States v. Hassan, 742 F.3d 104, 135 (4th Cir.2014) (affirming after finding proper admission of lay opinion testimony).
     