
    UNITED STATES of America, Appellee, v. Andre GORE, aka Sha; Kwaun Counts; Danny Johnson; Arnold Craig Mullins, aka Fruit, aka Ronald Davis; Julius Holloway, aka Kenneth Robinson, aka Bill Hayes; Corey Roche, aka Jeffrey L. Preston, aka Cory Williams, aka Kasheen Roche; James Brown; Elliot Hernandez; Dwight Hughes, Joseph Lane; Andre Moore; Jose Rolon; Jamade Humbert; Kareem Billups, aka Beebo; Sean Billups, aka Sean Tedder; Sharnseearay Goddard, aka Diana Brassell; Reno Lyons; Reginald Rice; Tauheedah Walker; Edward Curtis Barnes, aka FNU LNU John Doe, aka Norman Flick; Willie L. Tedder, aka Ike, Alfredo Cohoon, Defendants, Harvey Wells, Defendant-Appellant.
    Docket No. 97-1027.
    United States Court of Appeals, Second Circuit.
    Argued Dec. 10, 1997.
    Decided Aug. 21, 1998.
    
      David A. Lewis, New York City (Henriette D. Hoffman, The Legal Aid Society, Federal Defender Division, Appeals Bureau, New York City, of counsel), for Defendant-Appellant.
    Bernard J. Malone, Jr., Assistant United States Attorney, Albany, NY (Thomas J. Maroney, United States Attorney for the Northern District of New York, Donald T. Kinsella, Assistant United States Attorney, Albany, NY, of counsel), for Plaintiff-Appellee.
    
      Before: OAKES, PARKER and WOOD, Circuit Judges.
    
      
       The Honorable Harlington Wood, Jr., of the United States Court of Appeals for the Seventh Circuit, sitting by designation.
    
   PARKER, Circuit Judge:

Defendant Harvey Wells (“Wells”) appeals from a judgment of conviction and sentence entered January 6, 1997, in the United States District Court for the Northern District of New York (Frederick J. Scullin, Judge) following a trial by jury. Wells was convicted of: (i) engaging in a criminal conspiracy to distribute heroin, in violation of 21 U.S.C. § 846; (ii) the distribution of heroin on July 13, 1995, in violation of 21 U.S.C. § 841; and (iii) the possession of heroin with the intent to distribute on July 13, 1995, in violation of 21 U.S.C. § 841, which resulted in the imposition of a sentence of 27 months’ imprisonment on each count, to be served concurrently, plus a term of three years’ supervised release, and a special assessment of $50 per count.

I. BACKGROUND

In the Spring of 1995, the Albany Police Department began a long-term investigation into heroin trafficking in Albany. Primarily, the police investigation relied on confidential informants — specifically, people arrested for drug-related crimes who agreed to cooperate with the government in exchange for leniency in charging and sentencing. In March 1995, the police arrested Bruce Taft (“Taft”) on narcotics charges, and he began working as a confidential informant.

On the morning of July 13, 1995, Taft met police Detectives Dennis Bradt and Thomas Fitzpatrick. He made a drug buy and gave the detectives the heroin he purchased. Then, Taft drove to another location, followed by the detectives. Taft went inside a building and bought two glassine envelopes of heroin, branded with the name “Fuji Power” from Wells. Taft paid $45.00 for 0.11 grams of heroin. He wore a concealed recording device. While Taft was in Wells’s presence, the following conversation was tape-recorded (and later introduced as evidence at Wells’s trial):

HARVEY: Come in.
BRUCE: Forty five.
HARVEY: That’s for now, what did you owe me, what did you owe me.
BRUCE: I owe a couple. I’m working. I’ll be back.
HARVEY: You guys, boy. I need some money myself. I need, I just can’t work with that.
BRUCE: Yeah, yeah, I know.
HARVEY: Hear what I’m saying. For now people pay 25 for Fuji ... because ... [inaudible] ... boy
BRUCE: Yeah, it is good.
HARVEY: You always get something good and people don’t appreciate it. Yeah man, I don’t want to lose face with that dude man because he always has something decent and he always comes up right. Never tapped, never in a bag, never messed up, yeah so I should do him right. Man ... [inaudible] ... Make money!
BRUCE: Do yourself right. Do yourself right. Take care of yourself.
HARVEY: What you mean, man?
BRUCE: Take care of yourself. I’ll be back, I’ll take care of it.
HARVEY: I’m trying ... doing that right by me ... [inaudible] ... I can feel good by doing right by me. [Inaudible]. You know man ... I [inaudible] don’t get a fucking soda out of it....
BRUCE: You look like you’re going to the beach.
HARVEY: I’m going to one of my spots ... read a book ... stuff I do. Man cool, calm collected. Chilly Willy.
BRUCE: Here you go....

On September 20, 1995, the government filed a Superseding Indictment with 65 Counts charging 23 individuals, including Wells, with involvement in a narcotics conspiracy and in narcotics-related offenses. The district court granted Wells’s motion to proceed pro se with a court-appointed legal advisor, Thomas Flannery. Wells was the only one of the many defendants charged in the Superseding Indictment whose case went to trial.

A four-day jury trial was held in September 1996. The government’s case for all three counts rested entirely on the single transaction between Wells and Taft. The government introduced the tape-recording of their conversation and questioned Taft about its contents on the stand. The government called other witnesses to demonstrate the validity of the tape-recording, to corroborate Taft’s account of the transaction, and to identify the drugs involved and their chain of custody. For instance, Detectives Bradt and Fitzpatrick of the Albany Police Department testified about making the tape-recording and their dealings with Taft as a confidential informant. In addition, the government called the arresting officer and two persons who worked for the federal Drug Enforcement Agency (“DEA”) — a chemist who testified about the nature and quantity of the drugs and an agent who testified about the chain of custody of those drugs.

At trial, the government sought to show that because Wells sold heroin with the label “Fuji Power,” he was part of a conspiracy to distribute heroin using this brand name. While some witnesses testified to their familiarity with the heroin brand name “Fuji Power” and its prevalence in the Albany area, no evidence, other than this one sale, linked Wells with a conspiracy to distribute or possess with intent to distribute “Fuji Power” heroin. On summation, the government argued that because Wells had to have a supplier in order to sell his drugs, he was de facto part of a narcotics conspiracy to sell “Fuji Power.” Further, the government argued, that Wells’s comments on the tape-recording verified the existence of a supplier to whom Wells would “lose face” were he not paid by Taft. No evidence was presented as to the existence of the person who was supposedly mentioned by Wells in the tape-recording. No witness testified to Wells’s participation in any conspiracy nor to his knowledge of the illegal ends of such a conspiracy. Harvey Wells did not take the stand in his own defense.

On September 17, 1996, the jury rendered a verdict of guilty against Wells on all three counts as charged. On December 19, 1996, the district court held Wells’s sentencing hearing, finding an Offense Level of 12 and a Criminal History Category of V, which together carry a sentencing range of 27 to 33 months imprisonment. As a result, the district court sentenced Wells to three concurrent terms of 27 months’ imprisonment, three years’ supervised release and a special assessment of $50 per count.

On December 30, 1996, Wells filed a timely notice of appeal. On March 18, 1997, this Court dismissed Wells’s appeal for failure to prosecute. Then, on April 8, 1997, this Court reinstated Wells’s appeal sua sponte, ordering appointment of new counsel pursuant to the Criminal Justice Act.

II. DISCUSSION

Wells raises three principal issues upon appeal. First and foremost, he contends that the government failed to adduce evidence at trial sufficient as a matter of law to convict him of conspiracy to possess and distribute heroin. He claims that the July 13, 1995, tape recording of his conversation with confidential informant Taft and his one drug sale fail to demonstrate any conspiratorial agreement and thus the proof at trial was inconclusive to show his participation in a narcotics conspiracy under 21 U.S.C. § 846. Second, Wells argues that the district court erred as a matter of law by allowing the jury to convict him of both distributing a controlled substance and possession with the intent to distribute that substance based on evidence of the same single sale of heroin. Finally, Wells contends that he should receive a new trial because of the spillover prejudice he suffered from the evidence introduced on the improper conspiracy conviction.

A. Sufficiency of the Evidence of Narcotics Conspiracy

“A defendant who challenges the sufficiency of evidence bears a heavy burden.” United States v. Pipola, 83 F.3d 556, 564 (2d Cir.1996), cert. denied, — U.S. -, 117 S.Ct. 183, 136 L.Ed.2d 122 (1996). This Court must consider “the evidence in the light most favorable to the government.” Glasser v. United States, 315 U.S. 60, 80, 62 S.Ct. 457, 86 L.Ed. 680 (1942). “We must credit every inference that the jury may have drawn in favor of the government.” United States v. Masotto, 73 F.3d 1233, 1241 (2d Cir.1996), cert. denied, — U.S. -, 117 S.Ct. 54, 136 L.Ed.2d 18 (1996) (quotation marks and citation omitted). The jury’s verdict must be sustained, if “any rational trier of fact could have found the essential elements of the crime beyond a reasonable doubt.” Jackson v. Virginia, 443 U.S. 307, 319, 99 S.Ct. 2781, 61 L.Ed.2d 560 (1979). These principles apply whether the evidence being reviewed is direct or circumstantial. See, e.g., Glasser, 315 U.S. at 80, 62 S.Ct. 457.

Wells was convicted of violating Section 846 of Title 21 of the United States Code. That section provides:

[A]ny person who attempts or conspires to commit any offense in this subchapter shall be subject to the same penalties as those prescribed for the offense, the commission of which was the object of the attempt or conspiracy.

21 U.S.C. § 846. In order to prove a conspiracy charge against a defendant, the government must present “ ‘some evidence from which it can reasonably be inferred that the person charged with conspiracy knew of the existence of the scheme alleged in the indictment and knowingly joined and participated in it.’ ” United States v. Sanchez Solis, 882 F.2d 693, 696 (2d Cir.1989) (quoting United States v. Gaviria, 740 F.2d 174, 183 (2d Cir.1984)). Both the existence of the conspiracy and the defendant’s participation in it with the requisite criminal intent may be established through circumstantial evidence. See, e.g., United States v. Tutino, 883 F.2d 1125, 1129 (2d Cir.1989). The defendant need not know the identities of all of the other conspirators, nor all of the details of the conspiracy. See Blumenthal v. United States, 332 U.S. 539, 557, 68 S.Ct. 248, 92 L.Ed. 154 (1947).

The essence of conspiracy is the agreement and not the commission of the substantive offense. See United States v. Abel, 258 F.2d 485, 489 (2d Cir.1958), aff'd on other grounds, 362 U.S. 217, 80 S.Ct. 683, 4 L.Ed.2d 668 (1960). Here, the record is devoid of any conspiratorial conduct. Without more, the mere buyer-seller relationship between Wells and Taft is insufficient to establish a conspiracy. See United States v. Beech-Nut Nutrition Corp., 871 F.2d 1181, 1191 (2d Cir.1989). The government contends that a conspiracy between Wells and his supplier is established through the tape-recorded conversation between Taft and Wells during the drug sale. Wells makes only one statement that mentions his involvement with another person: “I don’t want to lose face with that dude man because he always has something decent and he always comes up right. Never tapped, never in a bag, never messed up, yeah so I should do him right.” It is true that this remark may indicate Wells has a buyer-seller relationship with another person who is a source for his drugs. Without more, however, these remarks standing alone are legally insufficient to show a conspiratorial agreement to distribute drugs made between Wells and that unknown source.

As a matter of law, the crime of conspiracy must involve the agreement of two or more persons to commit a criminal act or acts “[s]ince the act of agreeing is a group act, unless at least two people commit it, no one does.” Developments in the Law—Criminal Conspiracy, 72 Harv. L.Rev. 920, 926 (1959). Parties can agree without being aware of one another’s identity. Blumenthal, 332 U.S. at 557-58, 68 S.Ct. 248. This Court, however, has held that to support a conviction the evidence must at least demonstrate the existence of the unknown co-conspirators and their complicity. See United States v. Cepeda, 768 F.2d 1515, 1517 (2d Cir.1985) (holding that defendant’s intent to enter into conspiracy with others unknown to distribute or possess with intent to distribute cocaine could not be inferred from presence and mere possession of paraphernalia usable in drug cutting).

Here, we have only Wells’s vague statement made contemporaneously with a single heroin sale. This statement alone is too thin a reed to support the essential element of a conspiracy—the agreement. Cf. United States v. Soto, 716 F.2d 989, 991-92 (2d Cir.1983) (holding that evidence that defendant resided at an apartment occupied by several others and used as a cutting mill, even coupled with knowledge that a crime was being committed there, was insufficient to sustain conspiracy conviction). It merely points to the existence of a potential drug source. The government asks us to permit jurors to draw an inference of an agreement without providing “a logical and convincing connection between the facts established and the conclusion inferred.” United States v. Salmon, 944 F.2d 1106, 1114 (3d Cir.1991). While the statement that Wells made may imply more than one transaction, it gives no specific indication of the exact nature of that transaction or the quantity of drugs involved. Therefore, it would be sheer speculation for jurors to conclude that an agreement to distribute drugs had been made.

Accordingly, we hold that no rational jury could find Wells guilty beyond a reasonable doubt of each element of the narcotics conspiracy offense on the basis of the meager evidence presented. The government has not satisfied the most basic element of the conspiracy charge — to show agreement to distribute drugs between Wells and another person, even if unknown. Therefore, we must reverse the judgment of conviction of the district court as to the finding of guilt on Count One.

B. “Merger” of § 841(a)(1) Convictions

Wells next argues that the district court erred as a matter of law by allowing the jury to convict him of both distributing heroin and possessing heroin with the intent to distribute it, both proscribed by 21 U.S.C. § 841(a)(1), as a result of a single sale of heroin. He contends that the two offenses merge into a single distribution of narcotics offense alone when the only quantity of drugs introduced into evidence is the quantity of drugs sold. Accordingly, Wells requests that this court vacate his conviction for possession with intent to distribute narcotics, pursuant to 21 U.S.C. § 841(a)(1).

The government counters that Wells failed to preserve this “merger” claim with requisite specificity in making his Fed.R.Crim.P. 29 motion at the conclusion of proof at trial. In response, Wells urges this Court to find that he forfeited the issue, rather than waived it, entitling him to plain error review pursuant to Fed.R.Crim.P. 52(b). We agree with Wells and thus analyze this aspect of his appeal under the plain error doctrine for the reasons discussed below.

1. Was the merger issue waived?

Wells argues that he did not waive the “merger” issue, but rather forfeited it by failing to make a timely assertion of the “merger” issue at trial. Hence, Wells contends that we may perform plain error review upon the “merger” issue. We agree with Wells. Under Rule 52(b), “error” means “deviation from a legal rule unless the rule has been waived.” United States v. Olano, 507 U.S. 725, 734, 113 S.Ct. 1770, 123 L.Ed.2d 508 (1993). Waiver has been defined as the “intentional relinquishment or abandonment of a known right.” Id. at 733, 113 S.Ct. 1770. Thus, issues not intentionally relinquished or abandoned but nevertheless not raised — that is, forfeited issues— may be reviewed for plain error.

At the conclusion of the proof at trial, Wells, through his legal advisor made “a Rule 29 motion” requesting “a motion for a judgment of acquittal based upon the failure of the government to prove any of the elements that is [sic] alleged in Count One of this indictment, and also the failure of proof in connection with Counts Two and Three [the distribution and possession charges].” Tr. at 414-15. Wells made no argument addressing the alleged legal error involved in charging him with two separate offenses for a single incident. The district court denied the motion as to “Counts 2 and 3” and reserved judgment as to Count One. Id. In addition, after the jury handed down its verdict of guilty on all three counts, the district court advised Wells that he could renew his Rule 29 motions made prior to the verdict, however, he failed to do so. See Tr. at 520. There is no evidence that Wells intentionally chose not to raise the merger issue for strategic reasons or knowingly and intelligently failed to raise the issue.

We find, therefore, that Wells’s “failure to make the timely assertion” of the “merger” issue at trial amounted to a “forfeiture,” rather than a “waiver.” Olano, 507 U.S. at 733, 113 S.Ct. 1770 (quotation marks and citation omitted). As a result, we may review the district court’s failure to merge Wells’s distribution and possession charges for plain error, pursuant to Rule 52(b). Id.

2. Plain Error Review

Rule 52(b) provides: “Plain errors or defects affecting substantial rights may be noticed although they were not brought to the attention of the court.” Fed.R.Crim.P. 52(b). In Olano, the Supreme Court set out specific limitations on appellate courts’ ability “to correct an error not raised at trial,” holding that “there must be (1) ‘error,’ (2) that is ‘plain,’ and (3) that ‘affect[s] substantial rights.’ ” Johnson v. United States, 520 U.S. 461, 117 S.Ct. 1544, 1548-49, 137 L.Ed.2d 718 (1997) (quoting Olano, 507 U.S. at 732, 113 S.Ct. 1770). “If all three conditions are met, an appellate court may then exercise its discretion to notice a forfeited error, but only if (4) the error ‘seriously affect[s] the fairness, integrity, or public reputation of judicial proceedings.’ ” Id. (quoting Olano, 507 U.S. at 732, 113 S.Ct. 1770) (internal quotations marks and citations omitted); see United States v. Viola, 35 F.3d 37, 42 (2d Cir.1994) (adopting the Olano limitations), cert. denied, 513 U.S. 1198, 115 S.Ct. 1270, 131 L.Ed.2d 148 (1995). Overall, Wells bears the burden of persuasion on appeal to show that the district court committed plain error. See Viola, 35 F.3d at 41.

a. Was the Error “Plain”?

Wells contends that the “merger” rule for convictions under the provisions of § 841(a)(1) was “plain” under current law at the time of his trial. In support of this contention, Wells relies on decisions from other circuits and dicta from this Court. While we ultimately agree with Wells that some form of the “merger” rule is “plain,” we reach this conclusion through different reasoning reflected in the discussion below.

Under Rule 52(b) an error is “plain” if it is “clear” or “obvious” under current law. Olano, 507 U.S. at 734, 113 S.Ct. 1770; see United States v. Bryan, 122 F.3d 90, 92 (2d Cir.1997). A “plain” error is “an error so egregious and obvious as to make the trial judge and prosecutor derelict in permitting it, despite the defendant’s failure to object.” United States v. Tillem, 906 F.2d 814, 825 (2d Cir.1990).

As a general principle, courts may not mete out multiple punishments for the same criminal conduct unless Congress intended such multiple punishments. See Rutledge v. United States, 517 U.S. 292, 297, 116 S.Ct. 1241, 134 L.Ed.2d 419 (1996); Prince v. United States, 352 U.S. 322, 327-28, 77 S.Ct. 403, 1 L.Ed.2d 370 (1957); Blockburger v. United States, 284 U.S. 299, 304, 52 S.Ct. 180, 76 L.Ed. 306 (1932); United States v. Rosario, 111 F.3d 293, 300-01 (2d Cir.1997). The Supreme Court has made it clear that the Double Jeopardy Clause of the Fifth Amendment “protects against multiple punishments for the same offense,” North Carolina v. Pearce, 395 U.S. 711, 717, 89 S.Ct. 2072, 23 L.Ed.2d 656 (1969), overruled in part on other grounds, Alabama v. Smith, 490 U.S. 794, 795, 109 S.Ct. 2201, 104 L.Ed.2d 865 (1989), and “[i]f a federal court exceeds its own authority by imposing multiple punishments not authorized by Congress, it [also] violates ... the constitutional principle of separation of powers.” Whalen v. United States, 445 U.S. 684, 689, 100 S.Ct. 1432, 63 L.Ed.2d 715 (1980).

This Court, however, has not yet addressed directly the application of Double Jeopardy in circumstances where multiple punishments have been prescribed for distribution of a controlled substance and possession with intent to distribute a controlled substance, both in violation of § 841(a)(1), based on evidence of only one transaction. However, other circuits have uniformly decided that convicting a defendant for distribution of a controlled substance and possession with intent to distribute that substance arising from the same single transaction with no other evidence of a separate possession constitutes double punishment unintended by Congress. See United States v. Rodriguez-Cardona, 924 F.2d 1148, 1159 (1st Cir.1991); United States v. Carcaise, 763 F.2d 1328, 1333 (11th Cir.1985); United States v. Gomez, 593 F.2d 210, 213 (3d Cir.1979) (en banc); United States v. Hernandez, 591 F.2d 1019, 1021-22 (5th Cir.1979) (en banc) (binding on the 11th Circuit as well); United States v. Oropeza, 564 F.2d 316, 323-24 (9th Cir.1977); United States v. Olivas, 558 F.2d 1366, 1368 (10th Cir.1977); United States v. Stevens, 521 F.2d 334, 336-37 (6th Cir.1975); United States v. Curry, 512 F.2d 1299, 1305 (4th Cir.1975) (relying on United States v. Atkinson, 512 F.2d 1235 (4th Cir.1975)). The lack of this Court’s precedent directly on point with regard to “merger” under § 841(a)(1) will not prevent the district court’s error from being deemed “plain” because this is not a case in which we have taken no position on a certain issue and upon which no other circuit has spoken, see United States v. Jennings, 12 F.3d 836, 838 (8th Cir.1994), or upon which there is a sharp dispute among the other circuits. See United States v. Ingenito, 531 F.2d 1174, 1175-76 (2d Cir.1976). Moreover, in a different context, we have found that multiple convictions for the same offense are cognizable under plain error doctrine. United States v. Coiro, 922 F.2d 1008, 1013-15 (2d Cir.1991).

Thus, we hold that in light of the above-described Supreme Court and Circuit Court law at the time of Wells’s conviction, the conviction and sentencing of an individual for both distribution and possession with intent to distribute arising from the same transaction with no additional evidence of a separate drug quantity violates double jeopardy principles. As we have not yet had the opportunity to discuss the application of these precedents to the circumstances presented by this case, we do so here.

To determine “whether convictions under separate sections of the federal criminal law arising from the defendant’s involvement in a single event or a common series of events violate double jeopardy principles,” we analyze the following three factors: “the language of the statutes, how those statutes fare under the Blockburger test, and express congressional intent, if any, on the issue of multiple punishments.” United States v. Muhammad, 824 F.2d 214, 218 (2d Cir.1987). Under Blockburger, we determine “whether there are two offenses or only one [by] whether each provision requires proof of a fact which the other does not.” 284 U.S. at 304, 52 S.Ct. 180.

When applying the Blockburger test, the Supreme Court has “often concluded that two different statutes define the ‘same offense,’ typically because one is a lesser included offense of the other.” Rutledge, 517 U.S. at 297, 116 S.Ct. 1241. For instance, the Supreme Court has concluded that multiple convictions are barred where separate statutes directed at “receipt” and “possession” of a firearm amounted to the “same offense,” in that proof of receipt “necessarily” included proof of possession. Ball v. United States, 470 U.S. 856, 861-64, 105 S.Ct. 1668, 84 L.Ed.2d 740 (1985). The Blockburger test is primarily a means of “identifying congressional intent to impose separate sanctions for multiple offenses arising in the course of a single act or transaction.” Iannelli v. United States, 420 U.S. 770, 785 n. 17, 95 S.Ct. 1284, 43 L.Ed.2d 616 (1975); see Muhammad, 824 F.2d at 217.

Despite the prominence of Blockburger in settling questions of multiple punishment for the “same offense,” almost none of the circuit-level cases addressing multiple punishments under § 841(a)(1) have employed the Blockburger test. See United States v. Sepulveda, 102 F.3d 1313, 1317 (1st Cir.1996) (finding that “satisfying Blockburger has not wholly satisfied the circuit courts”). For instance, in Hernandez, the Fifth Circuit found Blockburger inapplicable to multiple punishments under § 841(a)(1) because it reasoned that “[w]e do not deal here with the violation of separate statutes ... but with the interpretation of two phrases in one sentence of a single law.” 591 F.2d at 1022 n. 9. But see Carcaise, 763 F.2d at 1333 & n. 18 (noting that Hernandez relied on the Fifth Circuit’s “different evidence” test which “takes its lead from ... Blockburger ”). We concede that Blockburger has not been applied yet by the Supreme Court in cases such as this because we are not comparing “separate statutes,” see, e.g., Albernaz v. United States, 450 U.S. 333, 336, 101 S.Ct. 1137, 67 L.Ed.2d 275 (1981), but rather we compare separate clauses within the same statute.

Nonetheless, in United States v. Miller, 870 F.2d 1067 (6th Cir.1989), the Sixth Circuit found that the Blockburger test applied to separate convictions under § 841(a)(1) for manufacture of a controlled substance and the possession with the intent to distribute that substance because Blockburger is a test of statutory construction used to reveal whether Congress intended separate offenses and the offenses listed under § 841(a)(1) are proscribed in the disjunctive—separated by the word “or”—“thus permitting prosecution where either occurs.” Id. at 1071; see United States v. Zamora, 784 F.2d 1025 (10th Cir.1986) (same). We agree with the Sixth Circuit that the disjunctive list of conduct proscribed under § 841(a)(1) is subject to the Blockburger “same offense” test. The First Circuit appears to be the only circuit court to apply the Blockburger test to multiple punishments arising out of possession with intent to distribute and distribution under § 841(a)(1), see Sepulveda, 102 F.3d at 1316, finding that they were not the “same offense” based on the following reasoning:

It is possible — albeit unusual — to be guilty of distribution of a drug without also possessing it with intent to distribute. Someone who participates in a drug transaction — e.g., as a broker or armed guard — can be liable for distribution without ever possessing the drugs.... While “possession” is certainly helpful in proving distribution, it is technically not a necessary element.

102 F.3d at 1317 (citations omitted).

Generally, we agree with the First Circuit that possession with intent to distribute and distribution under § 841(a)(1) may involve “proof of a fact that the other does not,” satisfying the Blockburger test for different offenses for which cumulative punishment may be prescribed. However, we think that greater explanation is required. Under Blockburger, we must compare the respective elements of possession with intent to distribute and distribution. Albernaz, 450 U.S. at 338, 101 S.Ct. 1137. To be liable for distribution, the government must prove beyond a reasonable doubt that the defendant “knowingly and intentionally ... distributed ... a controlled substance.” 21 U.S.C. § 841(a)(1). Further, the essential elements of the crime of possession are that the defendant: (1) knowingly (2) possessed a controlled substance (3) with a specific intent to distribute it. See United States v. Martinez, 44 F.3d 148, 151 (2d Cir.1995).

We agree with the First Circuit that “the offense of distribution obviously does require an element not required for the crime of possession with intent, namely, the act of distribution.” Sepulveda, 102 F.3d at 1316. Further, some circuit-level courts have held the “proof of distribution does not necessarily include the element of possession.” United States v. Tejada, 886 F.2d 483, 490 (1st Cir.1989); see United States v. Brunty, 701 F.2d 1375, 1381 & n. 16 (11th Cir.1983); United States v. Winston, 687 F.2d 832, 834 (6th Cir.1982); United States v. Nelson, 563 F.2d 928, 931 (8th Cir.1977); United States v. Jackson, 526 F.2d 1236 (5th Cir.1976). These cases suggest that “[t]he possession with intent charge can be proved without proof of actual distribution and the distribution charge can conceivably be proved without proof of possession,” Stevens, 521 F.2d at 337 n. 2, which would satisfy the Blockburger test because “each [clause] requires proof of a fact which the other does not.” 284 U.S. at 304, 52 S.Ct. 180.

Despite finding that the possession with intent and distribution clauses of § 841(a)(1) satisfied Blockburger, the First Circuit adhered to the principle that “ ‘the offenses merge where the distribution itself is the sole evidence of possession, or where possession is shown to exist only at the moment of distribution.’” Sepulveda, 102 F.3d at 1317 (quoting United States v. Rodriguez-Cardona, 924 F.2d 1148, 1159 (1st Cir.1991)). As a corollary, those circuits adhering to the “merger” rule have permitted separate convictions and sentences under § 841(a)(1) for both possession with intent and distribution “when there is independent evidence of the defendant’s prior possession of the controlled substance before the actual time of distribution, or where there is other separate evidence for each offense.” United States v. Zabaneh, 837 F.2d 1249, 1257 (5th Cir.1988); see United States v. Tejada, 886 F.2d at 490 (same); see also Carcaise, 763 F.2d at 1333 (holding that “[w]here there is separate evidence of the two offenses, the offenses have not merged”).

Therefore, under Blockburger, we conclude that generally possession with intent and distribution should not be regarded as the “same offense,” nor is possession with intent always to be deemed a lesser included offense of distribution, see, e.g., Gomez, 593 F.2d at 213, except under the narrow set of facts “[w]here the evidence shows only that the defendant handed over a packet of drugs.” Sepulveda, 102 F.3d at 1317. In that limited scenario, the convictions for both possession with intent and distribution will fail the Blockburger test because no longer does each offense require proof of a fact that the other does not.

Rather than relying on Blockburger, almost all of the circuit-level cases addressing the “merger” of possession with intent and distribution under § 841(a)(1) look to the Supreme Court’s holding in Prince, a decision which addresses offense conduct listed in separate clauses within the same statute. See, e.g., Stevens, 521 F.2d at 337; see also Gomez, 593 F.2d at 213; Olivas, 558 F.2d at 1367; Oropeza, 564 F.2d at 323-24. In Prince, a defendant was convicted of two separate aspects of the Federal Bank Robbery Act, 18 U.S.C. § 2113(a): bank robbery and entry of a bank with intent to commit a robbery. The Supreme Court found that Congress meant to include the lesser offenses, such as entry of a bank with intent to commit robbery, but did not intend to “pyramid penalties” under § 2113. Prince, 352 U.S. at 327, 77 S.Ct. 403. Conviction of entry of a bank with intent to commit robbery involves entering a bank with the intent to steal but being frustrated before the robbery is consummated. Id. at 328, 77 S.Ct. 403. As a result, the Supreme Court held that the “mental element merges into the completed crime” and remanded for resentencing without addressing or disturbing the two separate convictions.

Relying on the authority of Prince, we have held that the conviction of possession of a controlled substance aboard an aircraft, in violation of 21 U.S.C. § 955, merges with the conviction of illegal importation, in violation of 21 U.S.C. § 952(a), once the latter offense has been committed. United States v. Valot, 481 F.2d 22, 27 (2d Cir.1973). In Valot, we arrived at this conclusion because “it would seem that the only reason that Congress enacted Section 955 was to cover cases involving unconsummated importation of a controlled substance.” Id.

By the same logic utilized in Valot, we conclude that Congress, in enacting § 401 of the Comprehensive Drug Abuse Prevention and Control Act of 1970 (the “Act”)(codified as 21 U.S.C. § 841(a)(1)) sought to penalize possession with intent to distribute in cases involving an unconsummated distribution of a controlled substance. See United States v. Mendoza, 902 F.2d 693, 697 (8th Cir.1990) (§ 841(a) “is structured in such a way as to criminalize successive stages of a single undertaking”); Gomez, 593 F.2d at 213 (finding that by entitling the Act “Comprehensive” “that Congress, in legislating against drug use, intended to encompass every act and activity which could lead to the proliferation of drug traffic”). We reach this conclusion in part based on our finding that the legislative history of § 841(a) “quite clearly indicates that Congress specifically intended that distributors of drugs should be severely dealt with,” Valot, 481 F.2d at 27, more so than mere possessors. Id. at 27 n. 3; See H.R.Rep. No. 91-1444 (1970), reprinted in 1970 U.S.C.C.A.N. 4566, 4576, 4614-18. Accordingly, we have consistently held that simple possession, in violation of 21 U.S.C. § 844, is a lesser included offense of possession with intent to distribute, see, e.g., United States v. Garcia-Duarte, 718 F.2d 42, 47 (2d Cir.1983), although we have been less consistent in determining whether mere possession is a lesser included offense of distribution. Compare United States v. Beverly, 562 F.2d 201, 204 (2d Cir.1977) with United States v. Welbeck, 145 F.3d 493, 498-99 (dicta). Regardless, it follows that possession with intent to distribute merges with distribution “where the distribution itself is the sole evidence of possession, or where possession is shown to exist only at the moment of distribution.” Rodriguez-Cardona, 924 F.2d at 1159; see also Zabaneh, 837 F.2d at 1257. 
Therefore, we agree with the Eighth Circuit that “[al]though a defendant can be charged with multiple offenses under [§ 841(a)(1)],” based on our finding that the clauses under § 841(a)(1) satisfy the Blockburger test for separate offenses, “Congress did not intend for a defendant to be cumulatively punished for two or more offenses based on the same act.” Mendoza, 902 F.2d at 697; see Gomez, 593 F.2d at 213.

In the instant case, Wells was convicted of separate counts of possession with intent to distribute and distribution based on a single sale of heroin to the government informant, Taft. There was no evidence demonstrating additional quantities of drugs in Wells’s possession at that time or at any point in time before the distribution to Taft. Therefore, this is the paradigmatic case where possession with intent to distribute merges into distribution.

In sum, we hold that the rules laid out in Rutledge, Prince, the uniform holdings of our sister circuits, and our own line of cases are relevant precedents with a sufficient level of specific applicability to the facts of this case so that the district court’s error was clear and obvious under the law at the time of Wells’s trial.

b. Did the error affect “substantial rights”?

We turn next to the question of whether the error affected Wells’s substantial rights. Wells contends that the error was determinative to the outcome of his case because, had the error not been made, he would not have been convicted of and sentenced for possession with intent to distribute. We agree.

Any plain error must have affected the appellant’s “substantial rights,” meaning the error must have been prejudicial and it must have affected the outcome of the district court proceedings. Olano, 507 U.S. at 734-35, 113 S.Ct. 1770. Though “prejudice” is also required to show that an error is not “harmless,” pursuant to Fed.R.Crim.P. 52(a), the important difference of plain error prejudice is that “[i]t is the defendant rather than the Government who bears the burden of persuasion with respect to prejudice.” Id.

Generally, as in this case, the sentence for possession with intent to distribute runs concurrently with the sentence for distribution. See, e.g., Beverly, 562 F.2d at 204. In Rutledge, the Supreme Court noted that as long as 18 U.S.C. § 3013 provides for a $50 special assessment on each conviction, a second conviction will always amount to a second punishment. 517 U.S. at 301, 116 S.Ct. 1241. The High Court also pointed out that a second conviction has other collateral consequences, such as its effect on parole eligibility, its possible future use under a recidivist statute, and social stigma. Id. at 301-03, 116 S.Ct. 1241; see also Ball, 470 U.S. at 856, 105 S.Ct. 1668. “Thus, the second conviction, even if it results in no greater sentence is an impermissible punishment.” Rutledge, 517 U.S. at 301, 116 S.Ct. 1241.

Most pre-Ball and pre-Rutledge circuit-level cases holding that possession with intent merges with distribution for the same act vacated the second sentence (for possession with intent) but allowed the second conviction to stand, relying in part on Prince. See, e.g., Stevens, 521 F.2d at 337. In both Ball and Rutledge, however, the High Court reversed the second conviction as well as the second sentence, without explicitly addressing this aspect of Prince. 470 U.S. at 856, 105 S.Ct. 1668; 517 U.S. at 303, 116 S.Ct. 1241. Based on Ball and Rutledge, we believe that both the second conviction and the second sentence should be vacated in this case. See Mendoza, 902 F.2d at 697-98 (holding that after Ball, “the view that a second conviction by itself [for possession with intent to distribute] is harmless has been rejected”); Sepulveda, 102 F.3d at 1317; see also Rosario, 111 F.3d 293, 300-01 (2d Cir.1997) (holding that, under Rutledge, second conviction must be vacated even if no additional sentence is imposed).

Here, Wells was sentenced to three concurrent terms of 27 months’ imprisonment for each of his three convictions, the low end of the range for a total offense level of 12. The district court arrived at this sentence by grouping Wells’s three narcotics-related offenses together pursuant to U.S.S.G. § 2D1.1 and calculating an offense level of 12 based on the aggregate quantity of drugs. Wells was also assessed a $150 special assessment, $50 for each conviction.

According to the Rutledge decision, the district court erred by meting out multiple punishments, in the form of two convictions and two special assessments, unauthorized by Congress for the same offense. Therefore, Wells has met his burden to show that the error of the district court affected his substantial rights because he is correct in asserting that he was prejudiced by being convicted of and sentenced for both possession with intent to distribute and distribution arising from the same transaction.

c. Did the error “seriously affect the fairness, integrity and public reputation of judicial proceedings”?

Lastly, we consider whether we ought to exercise our discretion to reach the merger issue because Wells’s erroneous conviction will affect the “fairness, integrity and public reputation of judicial proceedings.” Olano, 507 U.S. at 732, 113 S.Ct. 1770. When “a federal court exceeds its own authority by imposing punishments not authorized by Congress, it violates not only the specific guarantee against double jeopardy, but also the constitutional principle of separation of powers in a manner that trenches particularly harshly on individual liberty.” Whalen, 445 U.S. at 689, 100 S.Ct. 1432. In addition, “[e]rrors of constitutional dimension will be noticed more freely than less serious errors.” 3A C. Wright, Federal Practice and Procedure: Criminal 2d § 856, at 342 (2d ed.1982); see United States v. Lopez, 414 F.2d 272, 273 (2d Cir.1969) (per curiam). Furthermore, we have noted that this Court will more likely find plain error where, as here, the relief sought has less impact on the judicial system than the granting of a new trial. See United States v. Leung, 40 F.3d 577, 586 n. 2 (2d Cir.1994). Based on the foregoing, we thus decide to exercise our discretion to address the plain error raised by Wells’s appeal because to do otherwise would negatively impact the fairness, integrity and public reputation of judicial proceedings. In conclusion, we find that Wells has met the stringent four-part test for plain error and, therefore, his conviction for possession with intent to distribute heroin, in violation of § 841(a)(1), must be vacated.

C. Spillover Prejudice

Wells argues that he must receive a new trial on any remaining, non-conspiracy counts because of the spillover prejudice from his invalidated conspiracy conviction. In particular, Wells points to the government’s introduction of evidence, supporting the conspiracy charge, concerning the entire police investigation into heroin distribution in Albany. Wells contends that this evidence would not have been admitted had he been tried on simple narcotics charges. In addition, Wells claims that the conspiracy evidence was so similar to the evidence on the other counts that it would have caused prejudice, relying on United States v. Rooney, 37 F.3d 847, 855-57 (2d Cir.1994). The Government counters that this type of background information on the nature and type of police investigation that led to Wells’s arrest would have been admissible even if he had not been charged with conspiracy, and thus, caused him no prejudice.

“When an appellate court reverses some but not all counts of a multicount indictment, the court must determine if prejudicial spillover from evidence introduced in support of the reversed count requires the remaining convictions to be upset.” Rooney, 37 F.3d at 855. In evaluating a claim of prejudicial spillover of evidence from an invalidated count, this Court looks to several factors: (1) whether the evidence from the invalidated count would have incited or aroused the jury to convict the defendant on the remaining counts; (2) whether the reversed and remaining counts arose out of similar facts, the evidence of which would have been admissible as to both; (3) whether the evidence on the reversed and on the remaining counts was completely dissimilar, permitting the inference that the jurors were able to keep the evidence separate; and (4) whether the strength of the government’s case on remaining counts could withstand the potential spillover prejudice. Id. at 855-56.

Here, there was so little evidence at trial of any conspiracy that its prejudicial spillover would be unlikely to incite the jury to convict on the remaining counts. On the contrary, the strength of the government’s case on the remaining counts may have swayed the jury’s verdict on the weaker conspiracy count. Further, the evidence introduced on the conspiracy count concerning the nature and type of police investigation that led to Wells’s arrest would have been admissible as “background evidence” in a trial for the narcotics sale alone. See United States v. Regan, 103 F.3d 1072, 1082-83 (2d Cir.1996) (holding that allegations of misconduct made by certain witnesses were not offered for their truth, but simply as “background evidence” of the events that led to the investigation which resulted in defendant’s arrest). Thus, no new trial is warranted in this case as we find no spillover prejudice from the minimal background and cumulative evidence introduced with regard to the invalidated convictions.

III. CONCLUSION

We vacate defendant’s judgment of conviction on Counts 1 and 18, as charged in the Superseding Indictment. We affirm the defendant’s judgment of conviction on Count 47 for distribution of heroin, in violation of 21 U.S.C. § 841(a)(1). Finally, we remand for resentencing based on a conviction of Count 47 alone.
      
      . Other Circuits have uniformly held that the mere relationship between a buyer and seller of drugs is not sufficient to show a conspiratorial agreement under § 846. United States v. Mims, 92 F.3d 461, 465 (7th Cir.1996) (holding buyer-seller relationship insufficient to show conspiracy "even where the buyer intends to resell the purchased narcotics”); United States v. Morris, 836 F.2d 1371, 1373-74 (D.C.Cir.1988); United States v. McIntyre, 836 F.2d 467, 471 (10th Cir.1987); United States v. Burroughs, 830 F.2d 1574, 1580-81 (11th Cir.1987); United States v. Lennick, 18 F.3d 814, 819 n. 4 (9th Cir.1994); United States v. Meyers, 646 F.2d 1142, 1145 (6th Cir.1981).
     
      
      . 21 U.S.C. § 841(a) provides: “[I]t shall be unlawful for any person knowingly or intentionally — (1) to manufacture, distribute, or dispense, or possess with intent to manufacture, distribute, or dispense, a controlled substance.”
     
      
      . We note that in United States v. Young, 745 F.2d 733 (2d Cir.1984), a case in which the appellant raised the same "merger” argument with respect to § 841(a)(1), this Court did not reach the issue, holding that the defendant had waived his right to appellate review by failing to oppose the jury instructions with sufficient clarity to alert the trial judge that he was renewing his earlier and explicit "merger” challenge thereto. Id. at 753-54. Young is distinguishable on its facts and on the law. First, at trial, Wells never raised the issue of merger of the two offenses with any specificity comparable to the defendant in Young. Thus, Wells cannot be found to have knowingly and intelligently waived his "merger" claim. Second, the panel in Young made no plain error analysis of the "merger” claim. It is questionable whether we would have resolved Young in the same manner today in light of the Supreme Court’s subsequent holding in Olano. 507 U.S. at 733, 113 S.Ct. 1770; see United States v. Yu-Leung, 51 F.3d 1116, 1121-22 (2d Cir.1995). Furthermore, we do not believe that Wells can be deemed to have waived this claim under Fed.R.Crim.P. 12(b) & (f). See Gabriel J. Chin, Double Jeopardy Violations as “Plain Error” Under Federal Rule of Criminal Procedure 52(b), 21 Pepp. L.Rev. 1161, 188 & n. 169 (1994). Therefore, Wells's claim, being forfeited, is an "error” properly reviewable under Rule 52(b).
     
      
      . Two of our cases have addressed the "merger" issue in dicta. In United States v. Jones, 779 F.2d 121 (2d Cir.1985), a case involving convictions for both distribution of heroin and possession of heroin with intent to distribute it, we noted in dicta that there was no merger problem "since the possession count related to heroin other than that involved in the sale count.” Id. at 122.
      Wells contends that we should look to our dicta in Young, 745 F.2d at 754, for guidance in determining the "merger” issue. Like Wells, the defendant in Young argued that he could not be convicted, pursuant to 21 U.S.C. § 841(a)(1), for both possession with intent to distribute narcotics and distribution of narcotics based on the same offense conduct. Id. at 753. We declined to reach the merits of Young's claims, but noted:
      Were we forced to resolve this issue today in the context of this case, a close question would be presented, for even under the prevailing view, it is debatable whether there was sufficient evidence ... apart from the actual distribution to support a conviction for possession with intent to distribute as well as one for actual distribution.
      
        Id. at 754. While the language in Young is somewhat persuasive, standing alone, it is obviously insufficient to amount to a rule of law that is “plain.”
     
      
      . We note that in United States v. Welbeck, 145 F.3d 493 (2d Cir.1998), this Court stated: "[p]ossession of crack cocaine is a lesser included offense of distribution in the sense that all of the elements necessary to a conviction for possession are also necessary to a conviction for distribution.” Id. 145 F.3d at 498-99. However, there was no discussion of this statement, no citation to any supporting authority, and we believe that this statement was merely dicta. In Welbeck, the defendant was charged with possession with intent to distribute cocaine base, in violation of 21 U.S.C. § 841(a)(1). Over the defendant's objection, the district court gave a lesser included offense instruction for the crime of mere possession, in violation of 21 U.S.C. § 844. As a result, the jury convicted the defendant of the lesser charge of possession. Upon appeal, the defendant claimed that the district court should have followed the Probation Department's recommendation that he receive a downward departure because he was entitled to a lesser penalty for conviction of the lesser charge. He objected because he would have been subject to the same sentence that he actually received for possession had he been convicted of the " 'greater charge' of distribution.” Id. It is unclear whether the court and the defendant refer to “distribution” as shorthand for the actual "greater charge” in that case: possession with intent to distribute. In addition, the panel in Welbeck based its holding, not on the above-quoted statement, but on the broad discretion of the district court not to depart and the mere advisory weight of Probation Department reports. Id. Finally, Welbeck was not a case in which the elements of the distribution of a controlled substance were discussed in detail in order to assess the applicability of Blockburger.
Thus, we find that Welbeck provides no binding precedent for our analysis of whether the offense of distribution of a controlled substance necessarily includes the element of possession.
     