
    IN RE: ACCUTANE LITIGATION
    A-25 September Term 2017 079958
    Supreme Court of New Jersey.
    Argued April 23, 2018 Decided August 1, 2018
    Paul W. Schmidt (Covington & Burling) of the District of Columbia bar, admitted pro hac vice, argued the cause for appellants Hoffmann-La Roche Inc. and Roche Laboratories Inc. (Gibbons, Dughi Hewit & Domalewski, Covington & Burling, and Peabody & Arnold, attorneys; Natalie H. Mantell, Newark, Russell L. Hewit, Cranford, Paul W. Schmidt, Michael X. Imbroscio (Covington & Burling) of the District of Columbia bar, admitted pro hac vice, and Colleen M. Hennessey (Peabody & Arnold) of the Massachusetts bar, admitted pro hac vice, of counsel and on the briefs).
    Bruce D. Greenberg, Newark and David R. Buchanan argued the cause for respondents Craig Abernethy, et al. (Lite DePalma Greenberg, Seeger Weiss, Weitz & Luxenberg, and Beggs & Lane, attorneys; Bruce D. Greenberg, Newark, David R. Buchanan, Peter Samberg, and Mary Jane Bass (Beggs & Lane) of the Florida bar, admitted pro hac vice, on the briefs).
    Edward J. Fanning, Jr., Newark, argued the cause for amici curiae HealthCare Institute of New Jersey, New Jersey Business & Industry Association, Commerce and Industry Association of New Jersey, and New Jersey Chamber of Commerce (McCarter & English, attorneys; Edward J. Fanning, Jr., and David R. Kott, Newark of counsel and on the brief, and Gary R. Tulp, on the brief).
    Christopher M. Placitella argued the cause for amicus curiae New Jersey Association for Justice (Cohen, Placitella & Roth, attorneys; Christopher M. Placitella and Jared M. Placitella, of counsel and on the brief).
    Allan Kanner submitted a brief on behalf of amicus curiae Allan Kanner, Esquire (Kanner & Whiteley, attorneys).
    Diana C. Manning submitted a brief on behalf of amicus curiae DRI - The Voice of the Defense Bar (Bressler, Amery & Ross, Plunkett Cooney, and DRI - The Voice of the Defense Bar, attorneys; Diana C. Manning, Florham Park, Mary Massaron (Plunkett Cooney) of the Michigan and New York bars, admitted pro hac vice, Hilary A. Ballentine (Plunkett Cooney) of the Michigan bar, admitted pro hac vice, and John F. Kuppens (DRI - The Voice of the Defense Bar) of the South Carolina bar, admitted pro hac vice, on the brief).
    Susan J. Kraham submitted a brief on behalf of amicus curiae Ironbound Community Corporation (Morningside Heights Legal Services, attorneys; Susan J. Kraham and Edward Lloyd, on the brief).
    Shalom D. Stone, Roseland submitted a brief on behalf of amicus curiae New Jersey Civil Justice Institute (Stone Conroy, attorneys).
    Melinda Martinson submitted a brief on behalf of amici curiae American Medical Association, Medical Society of New Jersey, American Academy of Dermatology, Society for Investigative Dermatology, American Acne and Rosacea Society, and Dermatological Society of New Jersey (Medical Society of New Jersey and Powers Pyles Sutter & Verville, attorneys; Melinda Martinson, and Ronald S. Connelly (Powers Pyles Sutter & Verville) of the District of Columbia and Maryland bars, admitted pro hac vice, on the brief).
    Gavin J. Rooney submitted a brief on behalf of amici curiae Kenneth S. Broun, Daniel J. Capra, Joanne A. Epps, David L. Faigman, Laird Kirkpatrick, Michael M. Martin, Liesa Richter, and Stephen A. Saltzburg (Lowenstein Sandler, attorneys).
    Gregory S. Chernack submitted a letter brief on behalf of amicus curiae Pharmaceutical Research and Manufacturers of America (Hollingsworth, attorneys).
   JUSTICE LaVECCHIA delivered the opinion of the Court.

At issue in this appeal involving a civil mass tort action is the admissibility of scientific evidence under the New Jersey Rules of Evidence.

Plaintiffs claim that a causal connection exists between Accutane, a prescription drug used in the treatment of nodular acne, and Crohn's disease, a chronic gastrointestinal illness. Litigation in New Jersey over Accutane's side effects has spanned more than a decade. This action is a continuation in that series of litigated matters. Since those actions first commenced in New Jersey in 2005, a number of epidemiological studies have been published, all concluding that there is no causal relationship between Accutane and Crohn's disease. Plaintiffs' experts dispute the conclusions of those studies, calling them flawed and lacking in value. Having rejected the evidence and conclusions of those epidemiological studies, one of plaintiffs' experts, relying on other facts and forms of data, asserts the contrary view that Accutane can in fact cause Crohn's disease. Defendants challenged the methodology used by both of plaintiffs' experts as unreliable and sought the exclusion of that expert testimony.

After a Rule 104 pretrial evidentiary hearing, the trial court excluded the testimony, holding that plaintiffs' experts' methodology was unsound because they did not interpret the relevant data and apply it to the facts of this case as would other experts in the field. The Appellate Division reversed, concluding that plaintiffs' experts employed a sound methodology and simply interpreted the data differently than defendants' experts.

Our Court was among the foremost to shift from exclusive reliance on a "general acceptance" standard for testing the reliability of scientific expert testimony to a methodology-based approach. See Landrigan v. Celotex Corp., 127 N.J. 404, 414, 605 A.2d 1079 (1992); Rubanick v. Witco Chem. Corp., 125 N.J. 421, 447, 593 A.2d 733 (1991). We initially took that step to allow the parties in toxic tort civil matters to present novel scientific evidence of causation if, after the trial court engages in rigorous gatekeeping when reviewing for reliability, the proponent persuades the court of the soundness of the expert's reasoning and methodology. Two years later, with its Daubert decision, the United States Supreme Court also abandoned the general acceptance test in favor of a methodology-based approach that entrusted trial courts with the role of gatekeeper. Both our civil standard and the federal standard moved in the same direction and towards the same common goal. Although the two standards are similar both in practice and in overall philosophy, we have never adopted Daubert or incorporated the factors identified in Daubert for use by our courts when performing the gatekeeper role.

We granted certification in this matter to address whether the trial court properly excluded plaintiffs' experts' testimony, whether the Appellate Division employed the correct standard in reviewing and overturning that decision, and whether our standard for assessing the reliability of expert witnesses is in need of clarification. It is with regard to the last issue that we are asked whether the Daubert standard's factors would further elucidate our own standard for the admissibility of expert testimony. We believe that they would.

We perceive little distinction between Daubert's principles regarding expert testimony and our own, and believe that its factors for assessing the reliability of expert testimony will aid our trial courts in their role as the gatekeeper of scientific expert testimony in civil cases. Accordingly, we now reconcile our standard under N.J.R.E. 702, and relatedly N.J.R.E. 703, with the federal Daubert standard to incorporate its factors for civil cases.

This case -- with its adversarial setting and full record -- provides the appropriate setting for illustrating how courts should evaluate the methodology of a credentialed expert when determining whether an opinion is based on scientifically sound reasoning. See Kemp ex rel. Wright v. State, 174 N.J. 412, 427, 809 A.2d 77 (2002). Our analysis of this record leads to a clear result: the trial court properly excluded plaintiffs' experts' testimony. Moreover, we reaffirm that the abuse of discretion standard must be applied by an appellate court assessing whether a trial court has properly admitted or excluded expert scientific testimony in a civil case. In this matter, the trial court did not abuse its discretion in its evidential ruling and, therefore, the Appellate Division erred in reversing the trial court's exclusion of the testimony of plaintiffs' experts.

I.

Before diving into the record and its contested scientific evidence, we set forth some basic background to the evidential standards in issue.

New Jersey Rules of Evidence 702 and 703 control the admission of expert testimony. N.J.R.E. 702 provides that "[i]f scientific, technical, or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training, or education may testify thereto in the form of an opinion or otherwise." N.J.R.E. 703 states that

[t]he facts or data in the particular case upon which an expert bases an opinion or inference may be those perceived by or made known to the expert at or before the hearing. If of a type reasonably relied upon by experts in the particular field in forming opinions or inferences upon the subject, the facts or data need not be admissible in evidence.

In State v. Kelly, this Court applied Rule 702's similarly worded predecessor, Evidence Rule 56, and identified three prerequisites to a determination that expert testimony is permissible:

(1) the intended testimony must concern a subject matter that is beyond the ken of the average juror; (2) the field testified to must be at a state of the art such that an expert's testimony could be sufficiently reliable; and (3) the witness must have sufficient expertise to offer the intended testimony.
[97 N.J. 178, 208, 478 A.2d 364 (1984).]

That standard provides the baseline for the admissibility of expert testimony. See Official Comments to N.J.R.E. 702 (noting that N.J.R.E. 702 incorporates standard articulated by Kelly).

The Kelly criteria elucidated application of the then-applicable "general acceptance" standard for admitting scientific evidence, which originated in Frye v. United States, 293 F. 1013 (D.C. Cir. 1923). In Frye, the then Court of Appeals for the District of Columbia excluded an expert's testimony concerning results from an early form of lie-detector test. Id. at 1013-14 ("[W]hile courts will go a long way in admitting expert testimony deduced from a well-recognized scientific principle or discovery, the thing from which the deduction is made must be sufficiently established to have gained general acceptance in the particular field in which it belongs."). For many years, the majority of state and federal jurisdictions, including New Jersey, adhered to the "general acceptance" standard first put forth in Frye. See Daubert v. Merrell Dow Pharms., Inc., 509 U.S. 579, 585-87, 113 S.Ct. 2786, 125 L.Ed.2d 469 (1993) (observing that Frye had been "the dominant standard for determining the admissibility of novel scientific evidence" for over seventy years). Then, in 1991, prior to the United States Supreme Court's seminal Daubert decision, in which the Court interpreted the Federal Rules of Evidence on expert testimony, our Court moved away from rigid adherence to the general acceptance standard.

This Court held in Rubanick that a court may admit expert scientific evidence on a causation theory in toxic tort litigation so long as "it is based on a sound, adequately-founded scientific methodology involving data and information of the type reasonably relied on by experts in the scientific field." 125 N.J. at 449, 593 A.2d 733. One year later, in Landrigan, we reinforced that in toxic tort matters involving novel theories of causation the trial court is obliged to review data and studies relied on by experts proffering an opinion in order to "determine whether the expert's opinion is derived from a sound and well-founded methodology that is supported by some expert consensus in the appropriate field." 127 N.J. at 417, 605 A.2d 1079. In Kemp, we expanded Rubanick to all novel medical causation circumstances and solidified the requirement of a pretrial Rule 104 hearing for assessing an expert's testimony. 174 N.J. at 430, 809 A.2d 77. Presently, a Kemp hearing is a common pretrial occurrence for resolving the reliability of expert scientific testimony.

A Kemp hearing provides the record for the present matter. After the conclusion of that hearing, the trial court determined that the contested evidence did not pass muster under our Rubanick evidentiary standard for assessing the reliability of proffered expert scientific testimony. The court made its findings in response to argument by the parties, rejecting the soundness of plaintiffs' experts' methodology.

II.

A.

Accutane is a prescription medication developed by defendants Hoffmann-La Roche Inc. and Roche Laboratories Inc. (defendants) and approved by the FDA in 1982 to treat recalcitrant nodular acne. Known chemically as isotretinoin, Accutane is part of a family of Vitamin A derivatives called retinoids. During the pre-approval clinical studies of Accutane, roughly a fifth of patients suffered some form of gastrointestinal side effects. Defendants also learned, after Accutane was on the market, that users were suffering symptoms of gastrointestinal upset such as inflammatory bowel disease (IBD) and peptic ulceration. Defendants eventually amended Accutane's warning label in 1984 to add that Accutane was "temporally associated with inflammatory bowel disease."

Accutane's alleged role as a cause of gastrointestinal disease ultimately resulted in a series of lawsuits against defendants. On May 2, 2005, this Court designated all pending and future New Jersey actions involving Accutane as a mass tort Multicounty Litigation (MCL) pursuant to Rule 4:38A. All cases involving Accutane were subsequently transferred to Atlantic County to be heard on a coordinated basis.

The present matter, the latest in a series of cases, involves over two thousand plaintiffs who allege that they developed Crohn's disease as a result of taking Accutane. The following basic information is not a matter of dispute between the parties. Crohn's disease is a form of IBD characterized by inflammation and ulcers in the digestive tract that can result in abdominal pain and other complications. Crohn's disease is distinguished from ulcerative colitis, the other form of IBD, by its ability to appear in any portion of the digestive tract, such as the esophagus and small intestine, whereas ulcerative colitis appears only in the large intestine. The scientific community appears to agree that Crohn's disease and ulcerative colitis are subject to slightly different risk factors. The exact cause of Crohn's disease is unknown.

In the years since many of the earlier cases regarding Accutane and IBD were decided, a series of epidemiological studies were published regarding the potential connection between Accutane and IBD, all of which concluded that Accutane is not causally associated with the development of Crohn's disease. On September 23, 2014, defendants filed a motion seeking a Kemp hearing on the association between Accutane and Crohn's disease. Defendants argued that epidemiological studies published in the scientific literature over the last several years effectively disproved any general causal association between Accutane and Crohn's disease. They contended that those studies are "the most important and reliable data that exists on Accutane and Crohn's disease" and are superior to other forms of evidence previously used in the MCL litigation such as case reports, animal studies, and theories on biological mechanisms.

The trial court scheduled a Kemp hearing to begin on February 2, 2015.

B.

The testimony focused intently on the aforementioned epidemiological studies. Accordingly, some background on the use of such studies in the formation of causal analyses provides context for the discussion of the evidence by the parties and the trial court. Much of the following discussion is taken from the Federal Judicial Center's Reference Manual on Scientific Evidence (3d ed. 2011), a source frequently relied on in the trial court proceedings.

Epidemiology "studies the incidence, distribution, and etiology of disease in human populations." Id. at 551. More particularly, epidemiology focuses on the question of "general causation," that is, whether the agent under study is "capable of causing disease," and does not focus on specific causation in a particular individual. Id. at 552. Epidemiology is premised on the idea "that disease is not distributed randomly" and that certain groups are at increased risk of contracting a particular disease. Id. at 551.

Epidemiological studies are used to test whether exposure to a particular agent causes a harmful effect or disease. Id. at 551-52. Although such studies may reveal an association between a particular agent and a particular disease, "[a]n association identified in an epidemiological study may or may not be causal," and causation must be assessed in consideration with the "strengths and weaknesses of the study's design and implementation, as well as a judgment about how the study findings fit with other scientific knowledge." Id. at 552-53. When evaluating a study's evidential value, a basic consideration is whether the particular study used a sound methodology, as well as the extent to which the study's results may be due to "bias, confounding, or sampling error." Id. at 554.

Among the different kinds of epidemiological studies, randomized trials are "considered the gold standard for determining the relationship of an agent to a health outcome or adverse side effect." Id. at 555. Where randomized trials cannot be performed, researchers rely on "observational" studies. Id. at 555-57. There are two types of such studies: (1) a case-control study, which measures and compares the frequency of exposure in the group with the disease (cases) and a similar group without the disease (controls); and (2) a cohort study, which compares a group of exposed and unexposed individuals over a period of time. Id. at 557-59. Observational studies identify a group of individuals exposed to the agent in question and then compare their rate of disease to that of an unexposed group. Id. at 556. A weakness in such studies is "the possibility of differences in the two populations being studied with regard to risk factors other than exposure to the agent." Ibid. Investigators generally cannot control for a variety of "confounders" that may distort such a study's results. Ibid. A confounder's influence on a study's results therefore must be considered in the study's design and in the interpretation of its results. Ibid. Thus, when it comes to using epidemiological studies in legal matters, three basic questions arise in the assessment of a study's methodological soundness:

1. Do the results of an epidemiologic study or studies reveal an association between an agent and disease?
2. Could this association have resulted from limitations of the study (bias, confounding, or sampling error), and, if so, from which?
3. Based on the analysis of limitations in Item 2, above, and on other evidence, how plausible is a causal interpretation of the association?
[Id. at 554.]
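By way of illustration only, and not drawn from the record or the Reference Manual, the two observational designs described above are conventionally summarized by different measures of association: a cohort study by a relative risk, and a case-control study by an odds ratio. The sketch below assumes hypothetical counts invented for the example; the function names are likewise hypothetical.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Cohort design: incidence among the exposed divided by incidence among the unexposed."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Case-control design: odds of prior exposure among cases divided by odds among controls."""
    odds_cases = cases_exposed / cases_unexposed
    odds_controls = controls_exposed / controls_unexposed
    return odds_cases / odds_controls


# Hypothetical counts, invented for illustration; they do not come from any study in this case.
rr = relative_risk(exposed_cases=30, exposed_total=10_000,
                   unexposed_cases=20, unexposed_total=10_000)
or_ = odds_ratio(cases_exposed=40, cases_unexposed=160,
                 controls_exposed=50, controls_unexposed=350)

print(f"relative risk = {rr:.2f}")
print(f"odds ratio    = {or_:.2f}")
```

A value above 1.0 on either measure indicates an association only. Whether that association is attributable to bias, confounding, or sampling error, and whether it is causal, are the second and third questions listed above.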

Once an association has been found between exposure to a particular agent and development of a specific disease, researchers then consider whether that association "reflects a true cause-effect relationship." Id. at 597. To do so, researchers look to alternative explanations, such as bias or confounding factors, and then consider how well-recognized "guidelines for inferring causation from an association apply to the available evidence." Id. at 598. However, and importantly, those accepted "guidelines are employed only after a study finds an association to determine whether that association reflects a true causal relationship." Id. at 598-99. Commonly referred to as the "Hill criteria" or "Hill factors," they consist of the following:

1. Temporal relationship,
2. Strength of the association,
3. Dose-response relationship,
4. Replication of the findings,
5. Biological plausibility (coherence with existing knowledge),
6. Consideration of alternative explanations,
7. Cessation of exposure,
8. Specificity of the association, and
9. Consistency with other knowledge.
[Id. at 600.]

The Reference Manual contains a section entitled Reference Guide on Medical Testimony, which provides a "Hierarchy of medical evidence." Id. at 723. The parties in this matter do not dispute that the Reference Manual and the scientific community as a whole acknowledge such a hierarchy. The Reference Manual summarizes it as follows: "[w]hen ordered from strongest to weakest, systematic review of randomized trials (meta-analysis) is at the top, followed by single randomized trials, systematic reviews of observational studies, single observational studies, physiological studies, and unsystematic clinical observations." Id. at 723-24. Evidence at the bottom of the hierarchy may sometimes be "the first signals of adverse events or associations that are later confirmed with larger or controlled epidemiological studies." Id. at 724.

Finally, in addition to observational epidemiology, researchers sometimes look to animal studies for determining a given agent's toxicity in humans. Id. at 563. Such studies "often provide useful information about pathological mechanisms and play a complementary role to epidemiology by assisting researchers in framing hypotheses and in developing study designs for epidemiological studies." Ibid. However, animal studies also have significant disadvantages because biological differences between humans and the animals under observation create difficulties in extrapolating data from animal studies and applying it to humans. Ibid. Even so, "[w]here both animal toxicologic and epidemiologic studies are available, no universal rules exist for how to interpret or reconcile them." Id. at 564. That said, the Reference Manual acknowledges that there are "conflicting lines of cases" regarding the weight courts tend to give animal studies, which the Reference Manual suggests may be explained by differences in the available amount of epidemiologic data across different subjects. Id. at 564-65, 564 n.48. Specifically, the Reference Manual states that "when there is a substantial body of epidemiologic evidence that addresses the causal issue, animal toxicology has much less probative value." Ibid. On the other hand, "[w]here epidemiologic evidence is not available, animal toxicology may be thought to play a more prominent role in resolving a causal dispute." Ibid.

In 2009, scientists began examining the issue of Accutane's causal relationship to Crohn's disease through the lens of epidemiological studies. The first of those studies, published by Bernstein et al. in 2009, examined approximately 21,500 subjects and concluded that "[a]lthough there may be anecdotes of [Accutane] causing acute colitis [inflammation of the colon], our data suggest that [Accutane] is not likely to cause chronic IBD." In 2010, a study by Crockett et al. examining approximately 29,000 subjects found "no apparent association between isotretinoin and [Crohn's disease]," but did find a statistically significant increased risk of ulcerative colitis associated with isotretinoin. A study published in 2013 by Etminan et al. examining roughly 45,000 women found that there was no "increase in the risk for IBD, including UC [ulcerative colitis] or CD [Crohn's disease], with use of isotretinoin." Those same authors also performed a meta-analysis of the available data and again concluded that there was no association between Accutane and Crohn's disease. Etminan, supra note 9 at 216, 218-20. A study of 46,922 subjects treated with Accutane published by Alhusayen et al. in 2013 found "no significant association between isotretinoin use and IBD." In 2014, an abstract by Sivaraman et al. examining 509 subjects concluded that "Isotretinoin exposure does not appear to confer risk for either UC or CD independent of antibiotic exposure." Two other studies -- a 176,889-subject study performed by Fenerty et al. and a 1078-subject study by Rashtak et al., published in 2013 and 2014, respectively -- concluded that Accutane was not causally associated with IBD but did not include specific data for Crohn's disease. Finally, a study of approximately 44,000 subjects by Racine et al. published in 2014 concluded that Accutane use "was associated with a decreased [Crohn's disease] risk." The interpretation of those studies is the central issue in this appeal.

C.

Plaintiffs and defendants each produced two expert witnesses. Plaintiffs produced Dr. Arthur Asher Kornbluth, a gastroenterologist, and Dr. David Madigan, a statistician. Defendants produced gastroenterologist Dr. Maria Oliva-Hemker, and biostatistician Dr. Steven Goodman.

1. Plaintiffs' Expert -- Dr. Kornbluth

Plaintiffs' expert Dr. Arthur Asher Kornbluth is a board-certified physician in internal medicine and in the subspecialty of gastroenterology. He maintains an active clinical practice treating Crohn's disease and ulcerative colitis and is also a Clinical Professor of Medicine at the Icahn School of Medicine at Mount Sinai Hospital in New York City.

Dr. Kornbluth opined that there is evidence that Accutane can cause Crohn's disease. In rendering a general causation opinion on the key question in this matter, he explained that he based his finding of a causal association on "the pathogenesis and natural history of Crohn's disease, including the relevant intestinal anatomy on the macroscopic and cellular level," "the pharmacology of Accutane and its metabolites," "reports and assessments of Crohn's disease or acute gastrointestinal toxicity in patients treated with Accutane or its metabolites," and "large clinical trials in patients with Crohn's disease evaluating the clinical benefits of blocking the mechanism of action of Accutane's metabolites." Further, in addition to his review of the scientific and medical literature, Dr. Kornbluth also looked to a number of other materials produced as part of the Accutane litigation, including defendants' internal animal studies and causality assessments, and post-marketing reports.

Dr. Kornbluth's testimony had two themes: explaining why he found the epidemiological studies unreliable and uninformative regarding the issue of causation, and explaining his reliance on other forms of evidence such as case reports, animal studies, causality assessments, and his biological mechanism hypothesis.

With regard to the epidemiological studies, Dr. Kornbluth explained that he would not rely on any study that did not provide separate data for Crohn's disease. As a result, he found the Fenerty and Rashtak studies -- both of which analyzed data for IBD generally and not for Crohn's disease specifically -- to be uninformative. As for the studies that did look specifically for Crohn's disease, Dr. Kornbluth stated that most of them, including the Alhusayen, Crockett, Etminan, and Racine studies, were fatally flawed because they did not account for Crohn's disease's "prodrome," meaning the period between the onset of a disease's symptoms and its actual diagnosis. Dr. Kornbluth asserted that the median prodrome for Crohn's disease is roughly two-to-four years, and that the studies in question did not account for that period because they examined study subjects only for roughly one year. Because they did not examine patient data for a sufficient period of time, Dr. Kornbluth argued that those studies missed patients whose prodromes were longer than the studies' observation periods, thus biasing the results towards a finding of no association between Accutane and Crohn's disease. Dr. Kornbluth based his opinion regarding the median prodromal period on two studies: a study by Pimentel et al. that contained 45 total subjects with Crohn's disease, and a study by Barratt et al. that contained 230 total subjects with Crohn's disease.

Dr. Kornbluth detected other problems with the epidemiological studies. He asserted that the Crockett, Bernstein, and Etminan studies did not have enough patients and thus were "underpowered," meaning that they were not properly designed to detect a statistically significant increased risk, even if such a risk actually existed, because the study size was simply too small. He asserted that the Alhusayen and Etminan studies did not adjust for certain confounders, such as family history and smoking. With regard to the Racine and Bernstein studies, he noted that those studies were performed in France and Canada, respectively, and that French and Canadian patients typically receive half the Accutane dosage that American patients receive. Moreover, he noted that the protective effect found in the Racine study's results lessened as the dosages increased, indicating a dose toxicity response. He was also dismissive of the meta-analysis that the Etminan authors performed, stating that it suffered from the same limitations as the studies that went into it and that a meta-analysis based on studies that did not perform an "adequate analysis" could not inform the issue.
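For readers unfamiliar with the term, the notion of an "underpowered" study can be illustrated with a rough calculation. The sketch below is not taken from the record or from any expert's analysis. It assumes a simple comparison of two equally sized groups, a normal approximation, and a hypothetical baseline risk and relative risk chosen only for the example; the function name is hypothetical as well.

```python
from statistics import NormalDist


def approx_power(p_unexposed, relative_risk, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test.

    Normal approximation; ignores the negligible chance of rejecting
    in the wrong direction. All inputs are illustrative assumptions.
    """
    p_exposed = p_unexposed * relative_risk
    variance = (p_exposed * (1 - p_exposed)
                + p_unexposed * (1 - p_unexposed)) / n_per_group
    z_effect = abs(p_exposed - p_unexposed) / variance ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(z_effect - z_crit)


# Hypothetical scenario: a rare disease (1 in 1,000 baseline risk) and a true
# relative risk of 1.5. Neither figure comes from the studies discussed here.
power_small = approx_power(p_unexposed=0.001, relative_risk=1.5, n_per_group=5_000)
power_large = approx_power(p_unexposed=0.001, relative_risk=1.5, n_per_group=200_000)

print(f"power with   5,000 per group: {power_small:.2f}")
print(f"power with 200,000 per group: {power_large:.2f}")
```

Under those assumed inputs, the smaller study would usually fail to reach statistical significance even though the risk is real, while the larger study would almost always detect it. Whether any particular study in this matter was in fact underpowered depends on its actual design and data, which the sketch does not address.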

Dr. Kornbluth also noted that the Bernstein and Alhusayen studies both contained positive "point estimates," indicating an increased risk. He admitted that the results of both studies were not regarded as "statistically significant," but he intimated that the lack of statistical significance was due to the studies' inherent flaws and biases.
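The distinction between a positive point estimate and a statistically significant result can be shown with a short calculation. The sketch below is illustrative only: the counts are hypothetical, the function names are invented for the example, and the interval is the standard large-sample 95% confidence interval for a relative risk computed on the log scale, which is not necessarily the method used in any study discussed here.

```python
import math


def relative_risk_ci(a, n1, c, n0, z=1.96):
    """Return (point estimate, lower bound, upper bound) for a relative risk.

    a / n1: cases and total among the exposed.
    c / n0: cases and total among the unexposed.
    """
    rr = (a / n1) / (c / n0)
    se_log = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


def is_significant(lower, upper):
    """Conventionally significant only if the interval excludes 1.0 (no association)."""
    return lower > 1.0 or upper < 1.0


# Hypothetical counts: both studies have the same point estimate of 1.5,
# but the second has ten times as many subjects.
small = relative_risk_ci(a=15, n1=5_000, c=10, n0=5_000)
large = relative_risk_ci(a=150, n1=50_000, c=100, n0=50_000)

for label, (rr, lo, hi) in (("small", small), ("large", large)):
    print(f"{label}: RR = {rr:.2f}, 95% CI {lo:.2f} to {hi:.2f}, "
          f"significant = {is_significant(lo, hi)}")
```

In the smaller hypothetical study the interval spans 1.0, so the elevated point estimate is compatible with no association at all; in the larger one the same point estimate is statistically significant. The calculation addresses sampling error only and says nothing about bias or confounding.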

In the end, the Sivaraman study was the only study upon which Dr. Kornbluth was willing to rely. Even so, he disagreed with the study authors' conclusion. He explained that the study's unadjusted point estimate showed that "patients with Crohn's disease were five times more likely to have taken Accutane than those who did not," and that the study's unadjusted results also showed a statistically significant increased risk. However, in explaining why he disagreed with the Sivaraman authors' conclusion that Accutane does not cause Crohn's disease, he noted that the Sivaraman authors reached their adjusted point estimate by subtracting out patients who had taken antibiotics; while that adjusted point estimate still showed a heightened risk, the smaller number of patients meant that the result lost statistical significance. Moreover, he stated that he could not understand why the study authors had taken that step because "antibiotics per se did not influence the likelihood of developing Crohn's disease." It was thus "not clear" to him why the authors made such an adjustment.

Moving beyond the epidemiological studies, Dr. Kornbluth examined other lines of evidence that he found supported his causation opinion. He began by explaining why he believes it is biologically plausible for Accutane to cause Crohn's disease, although he had not published his causation theory or otherwise submitted it for peer review. He then discussed case reports, studied by other researchers, that illustrate what he referred to as "challenge/dechallenge/rechallenge," in which a patient was given Accutane and then developed a form of intestinal inflammation, which then ceased when Accutane use was discontinued and began again upon resumption of Accutane usage. He regarded that to be "very compelling" evidence of a causative effect.

He discussed his reliance on MedWatch reports -- "reports made by physicians, patients, [and] others to the FDA describing symptoms that they think were related to ingested medications." He noted the significant number of such reports relating to Crohn's disease and other gastrointestinal problems that correlated with the use of Accutane. Similarly, he testified that he relied on defendants' own internal materials, noting that defendants' scientists determined that Accutane should be contraindicated for individuals with certain gastrointestinal problems and expressed "severe concerns about the likelihood of patients having exacerbations of Crohn's disease or even developing it de novo." Dr. Kornbluth asserted that those assessments by defendants' scientists were supportive of a causal link between Accutane and Crohn's disease.

Finally, Dr. Kornbluth relied on animal studies to support his theory of causation. He testified that defendants performed experimental tests with Accutane on dogs and that those tests showed a temporal relationship between the administration of Accutane and gastrointestinal distress. Specifically, he noted a challenge/dechallenge relationship in that certain dogs given Accutane experienced gastrointestinal problems that ceased after cessation of Accutane administration. He also noted the indication of a dose toxicity curve, meaning that intestinal damage appeared more severe in dogs given higher doses of isotretinoin. However, he conceded that animal studies are meant only to generate hypotheses about a substance's effect on humans, and that a hypothesis is a "supposition that is to be further tested."

2. Plaintiffs' Expert -- Dr. Madigan

Plaintiffs' second expert, Dr. David Madigan, is a professor of statistics at Columbia University. Dr. Madigan did not give a causation opinion. Only Dr. Kornbluth opined on causation.

Dr. Madigan's testimony and expert report focused solely on whether the epidemiological studies in question were appropriately designed to discover an association between Crohn's disease and Accutane, if such an association did in fact exist. He concluded in his expert report that, after accounting for the prodrome associated with Crohn's disease, the available epidemiological studies do not provide statistically reliable information regarding Accutane's causal relation to Crohn's disease. His testimony was consistent with his report.

Dr. Madigan explained that he was tasked with examining the available epidemiological studies as they pertained to an association between Accutane and Crohn's disease. He analyzed six of the epidemiological studies: Bernstein, Crockett, Alhusayen, Etminan, Racine, and Sivaraman. He did not analyze the Rashtak and Fenerty studies because they addressed only IBD and did not have separate results for Crohn's disease.

Dr. Madigan noted that of the six studies he considered, only two -- Racine and the unadjusted Sivaraman results -- were statistically significant. Dr. Madigan performed a "power analysis" of the four studies that were not statistically significant. He explained that his purpose in doing so was to "shed some light" on whether the studies were reliable evidence that there is truly no effect or whether the studies were simply insufficiently powered to discover an effect even if such an effect exists.

Dr. Madigan then examined each study's statistical power to detect a fifty percent increased risk; he chose fifty percent because he considered it to be "an authentically important increased risk." He calculated the power for each of the four studies as follows: Bernstein -- 37.8 percent; Crockett -- 18.2 percent; Alhusayen -- 89.4 percent; Etminan -- 22.6 percent. Thus, for example, if there is a fifty percent increased risk of developing Crohn's disease for Accutane users, the probability that the Bernstein study would find a statistically significant result is 37.8 percent. It was thus "less likely than not" that most of the studies would find an increased risk if such a risk did in fact exist. Dr. Madigan concluded that this was "more an absence of evidence than anything." After accounting for the median prodromal period, which he derived from the Pimentel and Barratt studies, he determined that the power for three of the four studies had further decreased: Crockett diminished to 5.12 percent; Alhusayen diminished to 36.2 percent; and Etminan diminished to 4.5 percent.
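
By way of illustration only, the idea of statistical power described above can be sketched in Python. This is not Dr. Madigan's calculation. It assumes a normal approximation for a study's log relative-risk estimate, and the function name `approx_power` and every standard error shown are hypothetical inputs chosen for the example, not figures from any study in the record.

```python
from math import log
from statistics import NormalDist


def approx_power(relative_risk, se_log_rr, alpha=0.05):
    """Approximate two-sided power to detect `relative_risk`.

    Assumes the study's log relative-risk estimate is normally
    distributed with standard error `se_log_rr` (a hypothetical input).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    shift = abs(log(relative_risk)) / se_log_rr  # true effect in standard-error units
    # Probability that the estimate lands beyond either critical value.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical standard errors; not taken from any study in the record.
for se in (0.15, 0.30, 0.60):
    print(f"SE={se:.2f} -> power to detect a 50% increase: {approx_power(1.5, se):.3f}")
```

Under these assumptions, a less precise study (larger standard error) has a lower probability of returning a statistically significant result even when a fifty percent increased risk truly exists, which is the sense in which a study may be called underpowered.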

According to Dr. Madigan, four of the six studies -- Crockett, Alhusayen, Etminan, and Racine -- were totally unusable because they failed to account for Crohn's disease's prodrome. He explained that by failing to account for the prodrome, the studies were failing to observe cases of Crohn's disease that developed outside of the studies' observation period. In other words, the studies' failure to take account of the prodrome "biased [the studies] towards the null," meaning the studies were systematically biased in favor of not finding any effect. Dr. Madigan therefore concluded that those studies had an overriding flaw and that he was reluctant to refer to their results as "findings."

As for the remaining two studies, he explained that questionnaire-based studies such as Sivaraman are immune from the prodrome issue because they ask patients when their symptoms began and, thus, have "no limit going backwards in time." He also explained that there was a "diminished concern" for the Bernstein study due to its lengthy observation period. However, Dr. Madigan found other flaws in the Bernstein study, and he declined to rely on its findings. He stated that the study did not adjust for "unmeasured confounders" and it was performed in Canada rather than the United States, which created a problem as to "generalizability" due to differences in population and the lower dosage given to Canadian patients.

Accordingly, for Dr. Madigan, the Sivaraman study remained as the only study on which he was willing to rely, but with caveats. He did not agree with its conclusion. He pointed to the study's unadjusted results as showing a statistically significant increased risk of Crohn's disease for people taking Accutane. Although he acknowledged that the study's authors felt it necessary to adjust for antibiotic exposure, and that their adjusted results did not show a statistically significant increased risk for Crohn's disease, Dr. Madigan disputed that they actually performed an adjustment. He claimed that they actually performed a "subgroup analysis." He asserted that the adjusted figure from that analysis still showed an increased risk, even though it was no longer statistically significant due to the smaller number of patients. Accordingly, he chose to rely on the unadjusted results because a study based on a larger group of people will provide a "better estimate." He further explained that he did not understand why the study authors chose to adjust for antibiotic exposure because the study authors admitted in the study's abstract that "antibiotic exposure was not associated with Crohn's disease."

In sum, Dr. Madigan found the Sivaraman study to be "evidence of a strong association between Accutane and Crohn's disease," discounting all of the other studies on the basis that they "represented an absence of evidence" from which one cannot draw any definitive conclusions.

Dr. Madigan also took the position that a meta-analysis of the studies would be improper. He stated he did not perform a meta-analysis because most of the studies on which the meta-analysis would be based did not account for prodrome, which would render the meta-analysis both misleading and scientifically unreasonable. That is so, he explained, because a meta-analysis will inherit the flaws, biases, and structural problems of the studies upon which it is based.

In addition, Dr. Madigan performed a disproportionality analysis using the FDA's spontaneous reporting system database. He explained that a disproportionality analysis is a method of studying a spontaneous report database and reviewing the observed rate for a particular drug and a particular adverse event. The observed rate is compared with the rate at which Crohn's disease was reported for other drugs in the database. Analyzing the available data beginning in 1997, Dr. Madigan's results showed a "striking signal of disproportionality" indicative of a "strong association" between Accutane and Crohn's disease.
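
The comparison of reporting rates described above can be sketched as follows. The opinion does not identify the particular disproportionality statistic Dr. Madigan computed; the sketch assumes one commonly used measure, the proportional reporting ratio, and the function name and all counts are hypothetical and are not FDA data.

```python
def proportional_reporting_ratio(drug_event, drug_other, rest_event, rest_other):
    """Proportional reporting ratio from a 2x2 table of spontaneous reports.

    drug_event : reports naming the drug of interest and the event
    drug_other : reports naming the drug with any other event
    rest_event : reports naming all other drugs and the event
    rest_other : reports naming all other drugs with any other event
    """
    rate_for_drug = drug_event / (drug_event + drug_other)
    rate_for_rest = rest_event / (rest_event + rest_other)
    return rate_for_drug / rate_for_rest


# Hypothetical counts for illustration only; not FDA data.
prr = proportional_reporting_ratio(drug_event=40, drug_other=960,
                                   rest_event=200, rest_other=99_800)
print(f"PRR = {prr:.1f}")
```

A ratio near 1 would mean the event makes up about the same share of reports for the drug as for all other drugs in the database; a larger ratio is the kind of result that might be described as a signal of disproportionality.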

3. Defendants' Expert -- Dr. Oliva-Hemker

Defendants' expert Dr. Maria Oliva-Hemker is a Professor of Pediatric Inflammatory Bowel Disease and Chief of the Division of Pediatric Gastroenterology and Nutrition at Johns Hopkins University School of Medicine. Her testimony focused on disputing Dr. Kornbluth's testimony and explaining why epidemiological studies are preferred to case reports and animal studies in the hierarchy of evidence.

During the Kemp hearing, she stated that the available scientific evidence does not support any causal association between Accutane and Crohn's disease. Regarding Dr. Kornbluth's testimony on biological mechanism, she explained that Crohn's disease is idiopathic, meaning that its cause is unknown, and that any theory regarding a biological mechanism was therefore unreliable. Moreover, she stated that scientists would not ignore the available epidemiological evidence in favor of a hypothesis about a biological mechanism.

She also addressed plaintiffs' experts' opinions concerning the prodrome for Crohn's disease. Although admitting that "the diagnosis time can vary" and that there are some patients who are not diagnosed for several years, she stated that most patients are typically diagnosed within a year. She based that conclusion on a number of population-based studies, all of which concluded that the median prodrome is under one year. She was dismissive of plaintiffs' experts' use of the Pimentel study for determining the median prodrome for Crohn's disease, explaining that the study was very small and not representative of the average prodromal period, especially in light of the other, larger studies on the issue. She added that the authors of the epidemiological studies would have been aware of the available published data on the median prodromal period and would have accounted for it in designing their studies.

She also discussed the recognized hierarchy of evidence, explaining that certain types of data are meant to "generate hypotheses," whereas other, more reliable forms of evidence are meant to "test hypotheses." She stated that the epidemiological studies are currently the best available data on the issue of Accutane and Crohn's disease and that, from a medical-evidence point of view, Dr. Kornbluth's reliance on case reports and animal studies, which are considered lower forms of evidence, was inconsistent with recognized methodology. Concerning case reports, she testified that they are hypothesis-generating in nature and are "in the bottom tier of medical evidence." She explained that case reports are subject to "publication bias," meaning that only correlative events are reported and that the reports are subjective in nature, often contain incomplete information, and do not account for random chance. Further, she stated that such reports are difficult to interpret because Crohn's disease is a "relapsing and a remitting condition," and thus "the natural course of the disease may be mistaken for the results of removing or reintroducing the medication." She also cited an article by Reddy et al. for the proposition that epidemiological evidence is needed to confirm the inference of causality generated by a case report.

Finally, Dr. Oliva-Hemker addressed Dr. Kornbluth's interpretation of defendants' animal studies. She stated that animal studies may sometimes be important for "generating certain hypotheses," but that they are typically considered below human studies in the hierarchy of medical evidence. She noted that it is difficult to extrapolate data from animals to humans because of differences in metabolism, absorption, and other factors. Moreover, she testified that she would not rely on dog studies in this instance because dogs cannot get IBD and because recent research has shown that dogs may have hypersensitive gastrointestinal tracts that can result in distress owing more to the dog's anatomy than to the actual toxicity of the substance under observation. In sum, her testimony concluded that the animal studies are simply not sufficient generally for forming a conclusion about causation here both because of the nature of the disease at issue and because they are merely "hypothesis-generating data."

In contrast to the case reports and animal studies, she testified that the epidemiological studies are the best available evidence on the issue of Accutane's relation to Crohn's disease. Moreover, she testified that it would be entirely proper to consider a meta-analysis of the studies when such analysis is available. She explained that researchers now have access to the types of "hypothesis testing" epidemiological studies that were sought for testing the viability of adverse event case reports and that those studies have all reached similar findings of no causal effect despite variations in populations, time, and data sets.

4. Defendants' Expert -- Dr. Goodman

Defendants' second expert, Dr. Steven Goodman, is a Professor of Medicine and Health Research and Policy and Associate Dean for Clinical and Translational Research at Stanford University, where he serves as Chief of the Division of Epidemiology and co-director of Stanford's Meta-Research Innovation Center. Dr. Goodman's testimony focused on why the epidemiologic evidence is the best available evidence on the question of Accutane's causal relation to Crohn's disease and why a meta-analysis was a proper way of pooling those study results to reach a conclusion that Accutane does not cause Crohn's disease.

Dr. Goodman stated, during the Kemp hearing and in his expert report, that the epidemiological evidence pertaining to the causal relationship between Accutane and Crohn's disease is as strongly negative as epidemiologic evidence can be, and that there was no biological evidence, including a causal mechanism or otherwise, to contravene that evidence. He stated that the methods Doctors Madigan and Kornbluth used were flawed because they put almost no weight on the epidemiologic evidence and instead relied heavily on the much lesser forms of evidence, which is not how any scientific body would have proceeded. Dr. Goodman emphasized that there is now a consistent body of epidemiological evidence all pointing towards no causal association between Accutane and Crohn's disease and that the weighing of that body of evidence through use of meta-analysis also strongly supported the lack of any association or causal effect between Accutane and Crohn's disease.

Dr. Goodman discussed the hierarchy of medical evidence, stating that observational studies are second only to randomized controlled trials in terms of establishing causality. Like Dr. Oliva-Hemker, he explained that epidemiologic studies are a hypothesis-testing form of evidence, which is a higher form of evidence than hypothesis-generating data, such as case reports and animal studies. Moreover, hypothesis-generating evidence is typically used more for developing ideas to later be examined through epidemiological evidence. Accordingly, he was dismissive of any reliance on case reports here, which he stated are almost never used as a basis for a scientific determination of causality. As support for that proposition, Dr. Goodman cited the Reference Manual, which states that case reports are "at the bottom of the evidence hierarchy" and must later be "confirmed with larger, more controlled epidemiological studies."

For similar reasons, Dr. Goodman was also dismissive of Dr. Kornbluth's interpretation of animal studies, explaining that such studies are hypothesis-generating in nature and that it is difficult to extrapolate results from animal studies to determine the effect a given substance will have on a human subject. Moreover, he believed that the specific animal studies discussed in this matter were essentially meaningless because dogs do not develop IBD. He also took issue with Dr. Kornbluth's invocation of the Hill criteria in assessing the causal relationship between Accutane and Crohn's disease because the studies here showed no strength of association, and so there was no "association" in need of study.

After reviewing all of the epidemiologic evidence, Dr. Goodman stated that none of the studies showed a statistically significant increased risk of developing Crohn's disease from use of Accutane. In addition to his belief that each study contained reasonable results, he also noted the importance of the fact that all of the studies were concordant with each other -- that all of the studies produced consistent results pointing in the same direction was strong evidence that those results are reliable.

With regard to the Sivaraman study, he stated that the study should be interpreted according to its adjusted numbers, and that it was proper for the study authors to adjust for antibiotics. He explained that the adjusted number is always better because it tends to be less biased and therefore more reliable. Moreover, he found the Sivaraman study to be insignificant for purposes of his meta-analysis and in the overall scheme of evidence because of its small size.

He also explained the methodology of the meta-analysis he performed, describing how smaller and more imprecise studies are given less weight whereas larger and more precise studies are given more weight. He further stated that a meta-analysis is a way to increase precision by pooling studies that address the same question. Moreover, meta-analysis is useful where there is a question about whether independent studies are large enough to detect an effect size. Accordingly, he criticized Dr. Madigan's refusal to perform a meta-analysis of the studies despite arguing that most of the studies were underpowered because meta-analysis was "in a sense invented" to address power concerns.
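
The weighting described above, in which more precise studies count for more, can be sketched as a fixed-effect, inverse-variance pooling of relative risks. The opinion does not state which pooling model Dr. Goodman used, so the fixed-effect choice is an assumption, and the function name and every input figure are hypothetical rather than drawn from the studies before the Court.

```python
from math import exp, log, sqrt


def pooled_relative_risk(studies):
    """Fixed-effect, inverse-variance pooling of relative risks.

    `studies` is a list of (relative_risk, se_log_rr) pairs.
    Returns (pooled_rr, ci_low, ci_high) with a 95% interval.
    """
    weights = [1 / se ** 2 for _, se in studies]   # precise studies weigh more
    log_rrs = [log(rr) for rr, _ in studies]
    pooled_log = sum(w * y for w, y in zip(weights, log_rrs)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))             # smaller than any single study's SE
    half_width = 1.96 * pooled_se
    return exp(pooled_log), exp(pooled_log - half_width), exp(pooled_log + half_width)


# Hypothetical inputs; not the studies or figures before the Court.
example = [(0.9, 0.20), (1.2, 0.50), (1.0, 0.10)]
rr, low, high = pooled_relative_risk(example)
print(f"pooled RR = {rr:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Because the pooled standard error is smaller than that of any single study, the pooled estimate is more precise than its components, which is why pooling is offered as a response to concerns about underpowered individual studies. A pooled confidence interval that includes 1 would not be statistically significant.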

He then went through the results of his meta-analysis, which he performed for both IBD and Crohn's disease. For IBD, the meta-analysis resulted in a relative risk indicative of a protective effect, but with a statistically insignificant confidence interval. For Crohn's disease, the meta-analysis again returned a relative risk indicative of a protective effect, with a non-statistically significant confidence interval.

Dr. Goodman interpreted those results as consistent with "no effect."

Finally, Dr. Goodman turned his attention to the criticisms levied against the epidemiological studies -- specifically the prodrome issue on which plaintiffs' experts chiefly relied in disregarding the studies. Dr. Goodman criticized plaintiffs' experts' reliance on the Pimentel and Barratt studies for determining a median prodrome length of at least two-to-four years because of the studies' small size and because the Pimentel study's population was taken from patients whom gastroenterologists found difficult to manage and diagnose. Dr. Goodman felt that a study by Chouraki et al., which contained 7409 subjects with Crohn's disease and found a median prodrome of under a year, was much more reliable. Moreover, he stated that every population-based prodrome study -- which are the studies he believed most reliable -- found a prodrome of nine months or less. He thus concluded that the best available evidence on the issue consistently pointed towards a median prodrome of less than a year, which justified the epidemiological studies' use of a one-year observation period. Thus, in his opinion, none of the epidemiological studies were invalid due to a prodrome issue.

D.

After the Kemp hearing, the trial court issued an order granting defendants' omnibus motion to bar plaintiffs' experts from testifying on, among other things, whether the epidemiological studies on which the defense relied were flawed and unreliable and whether Accutane can cause Crohn's disease. The trial court also directed the parties to prepare an order listing the lawsuits affected by the ruling, and subsequently issued a May 8, 2015 order dismissing 2076 affected claims with prejudice.

In its decision concerning the exclusion of plaintiffs' expert witnesses, the trial court examined the expert testimony and scientific studies, laid out the relevant standard for the admission of expert witness testimony, and determined that plaintiffs' experts' testimony did not meet the applicable standard.

The trial court stated that Rubanick governed the admissibility of expert witness testimony in toxic tort cases in New Jersey. The court regarded the Rubanick standard, which it understood to be more flexible in assessing medical causation expert testimony than the "general acceptance" test of Frye otherwise traditionally used in New Jersey courts, as requiring an expert opinion to be based on a "sound, adequately-founded scientific methodology involving data of the type reasonably relied on by experts in the scientific field." (citing Rubanick, 125 N.J. at 449, 593 A.2d 733). Thus, to fulfill its duty as gatekeeper, the trial court considered "whether other scientists in the field [are] using similar methodologies in forming their opinions."

The trial court applied that standard and found plaintiffs' experts' testimony lacking. Focusing on the epidemiological studies, the trial court concluded that "there is no epidemiological evidence to justify a reasonable inference that there is a causal link between isotretinoin and [Crohn's disease]." Nor did the court believe that there was any rational basis for plaintiffs to resist the findings of all the epidemiological studies and to rely instead on case reports and animal studies, which the trial court determined were seriously flawed and a less reliable form of evidence than the epidemiological studies.

The trial court viewed Dr. Kornbluth and Dr. Madigan as "self-validating expert[s]" who were unwilling to submit their ideas for evaluation in the scientific community, either through peer review submission or through the scrutiny of the process of publication in scientific literature. Specifically regarding Dr. Kornbluth, the trial court stated that he "want[ed] to have it both ways" by rejecting the best available evidence as flawed and yet relying on inferior forms of evidence. As for Dr. Madigan, the trial court viewed his refusal to perform a meta-analysis and to instead rely on the Sivaraman study for causation and the Pimentel and Barratt studies for median prodrome, to the exclusion of all other studies, as an attempt to explain away the body of evidence on causation and on the median prodromal period.

The trial court therefore determined that plaintiffs' experts' examination of the evidence was a "conclusion-driven" attempt to cherry-pick evidence supportive of their opinion while dismissing other, better forms of evidence that did not support their opinion. The trial court believed that such a "stratagem cannot bridge the analytical gaps inherent in Plaintiffs' hypothesis."

Plaintiffs appealed and the Appellate Division reversed, holding that plaintiffs may present the experts' testimony at trial. In re Accutane Litig., 451 N.J. Super. 153, 163-64, 165 A.3d 832 (App. Div. 2017). After providing an overview of plaintiffs' experts' testimony, underlying scientific principles, and applicable legal standards, the panel concluded that "the experts relied on methodologies and data of the type reasonably relied upon by comparable experts." Id. at 199, 165 A.3d 832. Comparing the testimony of plaintiffs' experts and defendants' experts, the panel stated that plaintiffs' experts evaluated all of the evidence in accordance with established scientific standards and methodology and addressed the specific design flaws of the epidemiological studies, all of which are recognized in the scientific community as capable of producing unreliable results. Id. at 202, 165 A.3d 832. The panel explained that defendants' experts merely "interpret[ed] the epidemiological studies differently," and that a difference of opinion between the experts did not mean that plaintiffs' experts failed to rely upon a sound methodology. Id. at 202-03, 165 A.3d 832.

The panel found plaintiffs' experts to be "extremely well-qualified" and underscored that they "considered all of the relevant data and information, applied appropriate methodology in analyzing the epidemiological studies, and expressed valid reasons for rejecting the conclusions of some of the epidemiological studies and in accepting other studies as supportive of their opinion." Id. at 205, 165 A.3d 832. While noting the trial court's opportunity to view the witnesses firsthand, the panel disagreed with the trial court's characterization of plaintiffs' experts as "hired gun[s]," id. at 206, 165 A.3d 832, and expressed the view that the trial court's negative reaction to plaintiffs' experts was not supported by the trial record, id. at 211, 165 A.3d 832.

The panel further noted that, although a trial court's decision to admit or exclude evidence is subject to an abuse of discretion standard, a reviewing court owes "somewhat less deference to a trial court's determination[s]" regarding expert testimony. Id. at 196-97, 165 A.3d 832. Accordingly, because our evidence rules weigh strongly in favor of admissibility, the panel "conclude[d] that the [trial] court mistakenly applied its discretion in excluding the expert scientific testimony." Id. at 206, 165 A.3d 832.

We granted defendants' petition for certification. 231 N.J. 531, 177 A.3d 112 (2017). In addition, numerous parties were granted amicus status.

HealthCare Institute of New Jersey, the New Jersey Business & Industry Association, Commerce and Industry Association of New Jersey, and New Jersey Chamber of Commerce (collectively "the Industry Associations"); Kenneth S. Broun, Daniel J. Capra, Joanne A. Epps, David L. Faigman, Laird Kirkpatrick, Michael M. Martin, Liesa Richter, and Stephen A. Saltzburg (collectively "the Academics"); the American Medical Association, Medical Society of New Jersey, American Academy of Dermatology, Society for Investigative Dermatology, American Acne and Rosacea Society, and Dermatological Society of New Jersey (collectively "the Medical Associations"); the Pharmaceutical Research and Manufacturers of America; the New Jersey Civil Justice Institute (NJCJI); and DRI - The Voice of the Defense Bar (DRI), filed briefs supportive of defendants.

The New Jersey Association for Justice (NJAJ); the Ironbound Community Corporation (ICC); and Allan Kanner, Esq. (Kanner), filed briefs supportive of plaintiffs.

III.

A.

Defendants argue that the Appellate Division's decision effectively nullifies the trial court's role as the gatekeeper of expert witness testimony and will "allow[ ] any credentialed expert to argue their way to a jury." They contend that the appellate panel did not address the methodological inconsistencies inherent in plaintiffs' experts' reasoning and adopted a restrictive interpretation of the trial court's role as the "gatekeeper" of expert witness testimony that is at odds with precedent from both New Jersey and courts around the country. Defendants assert that the acceptance of such "internally-inconsistent" and "outcome-driven" testimony robs the expert witness standard of its vitality. They ask us to "bring clarity and consistency to New Jersey expert admissibility standards, including by addressing whether Daubert standards and precedent are relevant."

The current lack of clarity, defendants posit, has resulted in "vastly different applications of gatekeeping," as illustrated by the opposing viewpoints of the trial court and Appellate Division in this matter. Moreover, defendants argue that, "[a]lthough the Court need not adopt the federal Daubert standard to find that the Appellate Division erred, this case illustrates the practical benefits of doing so." Defendants add that the extent to which New Jersey courts may look to Daubert for guidance is currently unclear; they urge this Court to clarify that issue here.

With respect to the proffered expert scientific testimony, defendants contend that the Appellate Division failed to apply methodological scrutiny, improperly applying a "relaxed" standard for admissibility despite the existence of "well-developed science." By doing so, the Appellate Division allowed plaintiffs' experts to employ a methodology whereby they argued away better and more reliable forms of evidence such as epidemiological studies to rely on lesser forms of evidence such as case reports and animal studies. Thus, they contend that the Appellate Division ignored "key guidelines from this and other courts for assessing the reliability of expert testimony."

Specifically, defendants argue that the Appellate Division did not consider whether the experts used the data as it is used by scientists in the field, a requirement mandated by Rubanick; condoned the experts' failure to adhere to the hierarchy of evidence; and failed to look for any demonstration of scientific consensus for the experts' methods or views, such as peer-reviewed articles or treatises. And, they argue that the Appellate Division did not give appropriate deference to the trial court's exercise of discretion, but rather conducted essentially a de novo review.

In sum, defendants argue that appellate error has resulted in the allowance of expert testimony that is contradictory, unreliable, and logically incoherent. Defendants emphasize that none of the epidemiological studies concluded that there was a connection between Accutane and Crohn's disease. Doctors Madigan and Kornbluth were thus forced to use an inconsistent methodology whereby they "concocted" a theory on Crohn's disease's prodrome and selectively applied that theory to evidence that did not support their viewpoints. They did the same for their theories on study power, applying it where necessary to refute the evidence, but then ignoring it for the studies upon which they relied. Defendants argue that the experts' "contradictory methodology" should not be allowed to be advanced before a jury.

B.

Plaintiffs argue that the Appellate Division appropriately employed Rubanick's relaxed admissibility standard in this matter and properly assessed the reliability of plaintiffs' experts' methodology. Plaintiffs cite to the Reference Manual for the proposition that "epidemiology alone cannot prove causation," and contend that defendants have falsely categorized the nature of the expert testimony due to their erroneous belief that the epidemiological studies alone should be considered to the exclusion of all other evidence. According to plaintiffs, epidemiological studies "are not the be all and end all of causation evidence," but rather "one component of multiple lines of evidence that inform the causation issue." Thus, by considering all of the evidence in addition to the epidemiology, plaintiffs argue their experts employed a methodology based on sound scientific principles accepted in the scientific community. They further argue that the Appellate Division correctly found that their experts did not ignore the epidemiology, but rather examined the data with proper consideration of the strengths and limitations of the design of each of the studies, together with study biases. Accordingly, the Appellate Division properly held that the trial court exceeded its gatekeeping function in excluding the expert testimony.

Furthermore, plaintiffs argue that the Reference Manual does not endorse a "strict application" of the hierarchy of evidence. In fact, the Reference Manual does not suggest that epidemiological studies are "beyond scientific criticism" or that "no countervailing evidence should be considered." Plaintiffs thus contend that their experts did not stray from any "core scientific principles," but simply considered all of the evidence in forming their opinions. Plaintiffs posit that their experts merely viewed the epidemiological studies differently than defendants' experts, and that doing so is not improper because the "implications of a study are open to debate."

Plaintiffs also dispute the "key guidelines" identified by defendants for examining expert testimony. Plaintiffs note that the law does not require that experts submit their opinions for peer review in order to be admissible. Plaintiffs also take issue with defendants' arguments concerning scientific consensus, arguing that consensus is not required in toxic tort cases and that, by arguing the issue, defendants are seeking to take New Jersey back to the "general acceptance" standard. Furthermore, plaintiffs contend that defendants' argument for abandoning Rubanick's relaxed standard is at odds with this Court's interpretation of the trial court's gatekeeping role and would allow a trial court to usurp the role of the jury and determine for itself whether to accept an expert's opinion.

Plaintiffs also argue that the Appellate Division gave appropriate deference to the trial court, explaining that "less deference is owed to a trial court when the issue is the admissibility of expert proofs." They claim that the trial court failed to adhere to the principles expounded in Rubanick, which provided "well-founded support for the level of review that the Appellate Division employed." Thus, because the trial court substituted its judgment for that of the jury and improperly weighed the evidence, the Appellate Division properly reversed by applying a standard of review "long recognized by this Court." Finally, plaintiffs argue that there is no reason to import the Daubert standard into New Jersey law and note that this Court has previously declined to adopt the Daubert factors.

C.

Amici curiae Industry Associations, the Academics, and DRI argue that New Jersey's expert witness standard is in need of clarification and urge this Court to join the majority of other states by adopting the Daubert standard to ensure meaningful judicial gatekeeping and that only reliable and reliably applied expert testimony enters New Jersey's courts. Similarly, amici curiae NJCJI and the Medical Associations argue that the Appellate Division reached an incorrect decision even under existing New Jersey law, but maintain that adoption of the Daubert standard will provide helpful guidance and ensure meaningful and robust gatekeeping in New Jersey trial courts.

Specifically, the Industry Associations and the Academics maintain that although Daubert and the current standard are somewhat similar, Daubert provides "concrete guidelines" and a more fulsome analysis that ensures that the expert's methodology is reliable and applied in a way that "fits" the facts of a case. Both contend that plaintiffs' experts' testimony would not have withstood that more thorough analysis. Moreover, both argue that tying New Jersey closer to the Daubert standard would discourage forum shopping. Finally, the Academics assert that the Appellate Division should have applied a pure abuse of discretion standard in reviewing the trial court's ruling.

The Medical Associations argue that the Appellate Division incorrectly applied New Jersey's expert witness standard by ignoring the hierarchy of evidence and the unanimity of epidemiological evidence in favor of plaintiffs' experts' "un-vetted postulations." They point to a scientific consensus in the medical literature that Accutane does not cause Crohn's disease, and argue that plaintiffs' experts were forced to invert the hierarchy of scientific evidence to overcome the scientific consensus on the issue by relying on inferior forms of evidence such as case reports and animal studies.

NJCJI similarly argues that the Appellate Division permitted plaintiffs' experts to present their unsubstantiated "outlier viewpoint" without any meaningful judicial scrutiny or investigation of their methodology. NJCJI claims that the Appellate Division's decision improperly left the question of methodological soundness to the jury while limiting the trial court's gatekeeping function "to exclude only the most extreme and obvious forms of junk science."

NJCJI and DRI emphasize that robust gatekeeping is necessary because juries struggle to absorb complex scientific concepts and are poorly equipped to assess methodological soundness. Both amici express the concern that juries may be misled by highly-qualified experts who offer opinions that are not supported by the wider scientific community and that juries faced with complex scientific evidence may simply "fall back" on an expert's credentials as a basis for evaluating the testimony at issue. To guard against that risk, DRI argues that experts should be required to prove not only that their methodology is sound, but that such methodology is reliably applied to the facts of the case.

Finally, amicus curiae Pharmaceutical Research and Manufacturers of America submitted a brief detailing the scientific principles at issue here and arguing that the trial court correctly excluded plaintiffs' experts' testimony.

D.

Amici curiae NJAJ, ICC, and Kanner argue that this Court should not adopt the Daubert standard because the current New Jersey standard is well-settled and strikes an appropriate balance between proper judicial gatekeeping and the admission of novel scientific concepts. NJAJ claims that the Daubert standard is a "patchwork" of case law that has been applied inconsistently by both state and federal courts. NJAJ and Kanner further contend that the Daubert standard requires judges who are unfamiliar with scientific principles to make scientific judgments outside of their area of expertise. NJAJ and ICC argue that Daubert imposes undue and unnecessary burdens on courts and litigants by encouraging frivolous challenges to expert witness testimony. Both ICC and Kanner claim that adoption of the Daubert standard will lead to unjust results because it will result in the exclusion of reliable evidence, which will bar access to justice for innocent victims.

The NJAJ further argues that New Jersey's expert witness standard does not require any clarification or correction and that state trial courts have reliably applied it for many decades. NJAJ contends that defendants have provided no substantive reason why this Court should abandon the expert witness standards that it pioneered and "which continue to fairly promote just results." Furthermore, NJAJ disputes that New Jersey's current expert witness standard promotes the filing of cases in New Jersey by out-of-state plaintiffs, arguing that there is no evidence to support such an assertion. Finally, NJAJ, ICC, and Kanner maintain that the Appellate Division here applied the proper standard and properly found that the trial court exceeded its gatekeeping role by excluding plaintiffs' experts based on nothing more than personal disagreements as to their conclusions.

IV.

A.

Our Court was in the vanguard of courts to be persuaded that adherence to the Frye general acceptance standard as the sole test for assessing reliability of scientific expert testimony was unsatisfactorily constricting for fairly assessing reliability in certain areas of novel or emerging fields of science.

Rubanick marked the broadening of our standard, when this Court concluded that resort to a methodology-based standard would be appropriate for assessing reliability with respect to emerging scientific theory on causation in toxic tort litigation. 125 N.J. at 454, 593 A.2d 733. Justice Handler, writing for the Court, explained the impetus for the holding:

[T]oxic-tort litigation does not frequently encounter well-established and widely-accepted scientific theories of causation that can, at the level demanded by the scientific method, precisely delineate the causal path between the toxin and the pathology. Nevertheless, in such litigation there is often available data and information of a type that is used and relied on by experts in the field; further, there are reputable and highly qualified experts who, drawing on such data and information, have the proficiency to apply sound scientific methods sufficient to reach creditable opinions with respect to causation. We are thus strongly persuaded that a standard that accounts for those considerations should be employed to determine the reliability of expert opinion testimony relating to causation in toxic-tort litigation.
Accordingly, we hold that in toxic-tort litigation, a scientific theory of causation that has not yet reached general acceptance may be found to be sufficiently reliable if it is based on a sound, adequately-founded scientific methodology involving data and information of the type reasonably relied on by experts in the scientific field.
[Id. at 449, 593 A.2d 733.]

The Court further instructed courts to "consider whether others in the field use similar methodologies. 'What is necessary is that the expert arrived at his causation theory by relying upon methods that other experts in his field would reasonably rely upon in forming their own, possibly different opinions, about what caused the patient's disease.' " Id. at 449-50, 593 A.2d 733 (quoting Osburn v. Anchor Labs., Inc., 825 F.2d 908, 915 (5th Cir. 1987) ).

In remanding to the trial court for re-evaluation of disallowed expert testimony, the Rubanick Court explained that the proper inquiry is not whether the expert thought his reliance on the underlying data from thirteen studies regarding exposure to a potential carcinogen was reasonable or whether the trial court thought that reliance was reasonable; rather, the proper inquiry is whether comparable " 'experts in the field [would] actually rely' on that information." Id. at 451-52, 593 A.2d 733 (alteration in original) (quoting Ryan v. KDI Sylvan Pools, Inc., 121 N.J. 276, 289, 579 A.2d 1241 (1990) ).

One year later, in Landrigan, Justice Pollock, writing for the Court, applied the same approach and again remanded for a hearing on the disputed epidemiologic testimony about asbestos and colon cancer. 127 N.J. at 418-23, 605 A.2d 1079. The Court stated that, for its purposes, it did not need to "describe in detail how to structure an epidemiological study, analyze the data, draw conclusions about the study population, and, if possible, extrapolate from statistical results inferences about specific individual subjects[, i.e., determine specific causation]." Id. at 417, 605 A.2d 1079. But, Justice Pollock elaborated on the Court's direction to trial courts when assessing the reliability of a methodology used by an expert proffering scientific evidence: "epidemiologists, like experts generally, must be able to identify the factual bases for their conclusions, explain their methodology, and demonstrate that both the factual bases and the methodology are scientifically reliable." Ibid. When relying on such studies, the trial court should review them and "then determine whether the expert's opinion is derived from a sound and well-founded methodology that is supported by some expert consensus in the appropriate field." Ibid. (citing Rubanick, 125 N.J. at 449-50, 593 A.2d 733 ).

Moreover, Landrigan provided suggested tools for trial courts to use in rendering gatekeeping determinations about the reliability of an expert's methodology when the ultimate scientific opinion is not itself generally accepted. Landrigan explains that "[d]efined landmarks guide a trial court in making this determination. Support may be demonstrated by reference to professional journals, texts, conferences, symposia, or judicial opinions accepting the methodology." Ibid. (citing Kelly, 97 N.J. at 210-11, 478 A.2d 364 ). Additionally, the Court allowed for consideration of professional associations' acknowledged acceptance and recognition of a methodology's use. Ibid. Thus, methodology may be assessed for soundness using some of the same tools as general acceptance identifies for outcome.

B.

Not long after those dual holdings by our Court, the Supreme Court issued its seminal Daubert opinion in 1993 pronouncing that Frye had been superseded by the adoption of the Federal Rules of Evidence. 509 U.S. at 585-87, 113 S.Ct. 2786. The Daubert Court explained that the Federal Rules of Evidence do not make any mention of a general acceptance standard and that such a standard was at odds with the "liberal thrust of the Federal Rules and their general approach of relaxing the traditional barriers to opinion testimony." Id. at 588, 113 S.Ct. 2786 (internal quotations omitted). Thus, although Federal Rule of Evidence 702 "contemplates some degree of regulation of the subjects and theories about which an expert may testify," the subject of the scientific testimony does not have to be known to a certainty so long as it is derived from the scientific method and "supported by appropriate validation." Id. at 589-90, 113 S.Ct. 2786. Moreover, in addition to a methodology derived from the scientific method, the Court added that the testimony must be relevant to the facts at hand -- it must "assist the trier of fact to understand the evidence or to determine a fact in issue." Id. at 591, 113 S.Ct. 2786 (quoting Fed. R. Evid. 702).

Tying the principles together, the Court fashioned a new standard:

Faced with a proffer of expert scientific testimony ... the trial judge must determine at the outset ... whether the expert is proposing to testify to (1) scientific knowledge that (2) will assist the trier of fact to understand or determine a fact in issue. This entails a preliminary assessment of whether the reasoning or methodology underlying the testimony is scientifically valid and of whether that reasoning or methodology properly can be applied to the facts in issue.
[Id. at 592-93, 113 S.Ct. 2786 (footnotes omitted).]

The Court based that standard on the concept that the Federal Rules regarding expert testimony are "premised on an assumption that the expert's opinion will have a reliable basis in the knowledge and experience of his discipline." Id. at 592, 113 S.Ct. 2786. The Court noted that "[m]any factors will bear on the inquiry," and then proceeded to offer some observations on factors it believed most relevant in such evaluations. Id. at 593, 113 S.Ct. 2786. Those became known as the Daubert factors. First, trial courts may look to whether the scientific theory at issue can be, or has been, tested. Ibid. Second, a court may also consider whether the scientific theory has been published or subjected to some form of peer review. Ibid. That said, the Court did not consider publication as "a sine qua non of admissibility," but rather one form of peer review. Ibid. The Court reasoned that "submission to the scrutiny of the scientific community is a component of 'good science,' in part because it increases the likelihood that substantive flaws in methodology will be detected." Ibid. As a third factor, courts may also consider any "known or potential rate of error," and any "standards controlling the technique's operation," id. at 594, 113 S.Ct. 2786, which may be important in pattern testing and similar areas of science. Finally, the Daubert Court stated that general acceptance remains a consideration; accordingly, a technique that has garnered only minimal support within the scientific community "may properly be viewed with skepticism." Ibid.

In sum, the Court described the trial court's task as a "flexible" inquiry into the scientific principles at issue, ibid., one whose "overarching subject is the scientific validity -- and thus the evidentiary relevance and reliability -- of the principles that underlie a proposed submission," id. at 594-95, 113 S.Ct. 2786. As this Court did in Rubanick and Landrigan, the Supreme Court underscored in Daubert that the trial court must focus on the expert's principles and methodology -- not on the conclusions they generate. Id. at 595, 113 S.Ct. 2786. The trial court's task is thus to ensure "that an expert's testimony both rests on a reliable foundation and is relevant to the task at hand" by assuring that the evidence is based on valid scientific principles. Id. at 597, 113 S.Ct. 2786.

Although acknowledging that its decision was controversial, the Court asserted that it had struck a proper balance, stating that the adversarial process would provide proper safeguards in place of the more "uncompromising 'general acceptance' test," and that the trial judge's role as gatekeeper was essential for the quick and decisive resolution of legal disputes. Id. at 596-97, 113 S.Ct. 2786.

The Supreme Court elaborated on its Daubert standard for assessing reliability with two cases which, combined, round out the Daubert trilogy.

In General Electric Co. v. Joiner, the Court held that an abuse of discretion standard applies when reviewing a trial court's decision to admit or exclude expert testimony, even where that determination may be outcome determinative. 522 U.S. 136, 138-39, 118 S.Ct. 512, 139 L.Ed.2d 508 (1997). The Court also reinforced that trial courts are the "gatekeeper" tasked with screening such testimony. Id. at 142, 118 S.Ct. 512. The Court stated that, in its gatekeeper role, a trial court is free to exclude expert testimony where the expert's conclusions are not sufficiently tethered to the facts or drawn from the applicable data. Id. at 146-47, 118 S.Ct. 512. A trial court may determine in a given case that "there is simply too great an analytical gap between the data and the opinion proffered" for the expert testimony to be considered reliable. Id. at 146, 118 S.Ct. 512 (explaining that expert's conclusions and methodology "are not entirely distinct from one another" and that "nothing in either Daubert or the Federal Rules of Evidence requires a district court to admit opinion evidence that is connected to existing data only by the ipse dixit of the expert").

In Kumho Tire Co. v. Carmichael, the Court extended the Daubert approach to technical and other specialized knowledge admissible as expert testimony under Federal Rule of Evidence 702. 526 U.S. 137, 147-48, 119 S.Ct. 1167, 143 L.Ed.2d 238 (1999). Justice Breyer summarized Daubert as holding "that Federal Rule of Evidence 702 imposes a special obligation upon a trial judge to 'ensure that any and all scientific testimony ... is not only relevant, but reliable.' " Id. at 147, 119 S.Ct. 1167 (ellipsis in original) (quoting Daubert, 509 U.S. at 589, 113 S.Ct. 2786 ). The Kumho Court reasoned that Rule 702 -- and thus Daubert's methods for assessing reliability -- had to apply to all forms of expert testimony because the evidence rules "grant expert witnesses testimonial latitude unavailable to other witnesses on the 'assumption that the expert's opinion will have a reliable basis in the knowledge and experience of his discipline.' " Id. at 148, 119 S.Ct. 1167 (quoting Daubert, 509 U.S. at 592, 113 S.Ct. 2786 ).

Importantly, the Court in Kumho emphasized again that the Daubert standard is flexible, explaining (1) that the Daubert factors do not necessarily apply "to all experts or in every case," id. at 141, 119 S.Ct. 1167; (2) that "the law grants a district court the same broad latitude when it decides how to determine reliability as it enjoys in respect to its ultimate reliability determination," id. at 142, 119 S.Ct. 1167; (3) that the Daubert factors are not a "definitive checklist or test," id. at 150, 119 S.Ct. 1167 (quoting Daubert, 509 U.S. at 593, 113 S.Ct. 2786); and (4) that the gatekeeping inquiry must be " 'tied to the facts' of a particular 'case,' " ibid. (quoting Daubert, 509 U.S. at 591, 113 S.Ct. 2786). Thus, a trial court "can neither rule out, nor rule in, for all cases and for all time the applicability of the factors mentioned in Daubert.... Too much depends upon the particular circumstances of the particular case at issue." Ibid.

Ultimately, Kumho underscores that the objective of Daubert's gatekeeping requirement "is to make certain that an expert, whether basing testimony upon professional studies or personal experience, employs in the courtroom the same level of intellectual rigor that characterizes the practice of an expert in the relevant field." Id. at 152, 119 S.Ct. 1167. Accordingly, "the trial judge must have considerable leeway in deciding in a particular case how to go about determining whether particular expert testimony is reliable." Ibid.

Federal Rule of Evidence 702 was amended in 2000 to reflect the Supreme Court's trilogy of cases outlining the Daubert standard. See, e.g., Calhoun v. Yamaha Motor Corp., U.S.A., 350 F.3d 316, 320-21, 320 n.8 (3d Cir. 2003). That rule as currently written provides:

A witness who is qualified as an expert by knowledge, skill, experience, training, or education may testify in the form of an opinion or otherwise if:
(a) the expert's scientific, technical, or other specialized knowledge will help the trier of fact to understand the evidence or to determine a fact in issue;
(b) the testimony is based on sufficient facts or data;
(c) the testimony is the product of reliable principles and methods; and
(d) the expert has reliably applied the principles and methods to the facts of the case.
[Fed. R. Evid. 702.]

A majority of states have adopted some form of the Daubert standard, either explicitly or implicitly. See, e.g., State v. Porter, 241 Conn. 57, 698 A.2d 739, 746 (1997) (adopting Daubert ); M.G. Bancorp. v. Le Beau, 737 A.2d 513, 522 (Del. 1999) (same); Motorola, Inc. v. Murray, 147 A.3d 751, 756-57 (D.C. 2016) (en banc) (adopting Daubert and noting that standard is "widely used").

C.

In 2002, after the Daubert trilogy, we revisited the topic of the trial court's gatekeeping role under our current N.J.R.E. 702.

In Kemp, 174 N.J. 412, 809 A.2d 77, we extended the applicability of Rubanick beyond toxic tort cases. Kemp holds that the Rubanick standard for assessing the reliability of proffered expert testimony on scientific evidence should apply whenever "a medical cause-effect relationship has not been confirmed by the scientific community but compelling evidence nevertheless suggests that such a relationship exists." Id. at 430, 809 A.2d 77.

The Kemp decision further holds that a trial court has an independent obligation to ensure that plaintiffs have sufficient process for defending their evidentiary submissions. The Court explained that, by requiring a pretrial Rule 104 evidentiary hearing, the trial court would be able to properly "assess whether the expert's opinion is based on scientifically sound reasoning or unsubstantiated personal beliefs couched in scientific terminology." Id. at 427, 809 A.2d 77 (citing Landrigan, 127 N.J. at 414, 605 A.2d 1079 ).

We note, in concluding this section, that the Kemp Court observed in 2002 that New Jersey had not amended N.J.R.E. 702 to include "the three-factor test for the admissibility of expert testimony that is part of the Federal rule as amended in response to Daubert." Id. at 424 n.3, 809 A.2d 77. The Court added that its decision was not intended "to incorporate the Daubert factors into N.J.R.E. 702." Ibid. Although the question of whether to incorporate the Daubert factors into N.J.R.E. 702's standard for admissibility, or to otherwise revise N.J.R.E. 702 to incorporate evolving state case law on the reliability part of our standard, has arisen from time to time in the Court's interactions with the Supreme Court Committee on the Rules of Evidence, no action on such topics has been taken.

V.

We intend by this case to clarify and reinforce the proper role for the trial court as the gatekeeper of expert witness testimony. Defendants and several amici have good reason to ask for clarification of the judicial gatekeeping role to be performed in New Jersey courtrooms.

When this Court modified the general acceptance standard to adopt a more relaxed approach for causation expert testimony in toxic tort litigation, and later for all medical cause-effect expert testimony, it envisioned the trial court's function as that of a gatekeeper -- deciding what is reliable enough to be admitted and what is to be excluded. Those are not credibility determinations that are the province of the jury, but rather legal determinations about the reliability of the expert's methodology. We now reinforce the rigor expected of the trial court in that role under our existing case law.

Charged with determining whether to admit expert testimony, the trial court is responsible for advancing the truth-seeking function of our system of justice, while still allowing for new or developing opinions on medical causation that may not yet have gained general acceptance. See Rubanick, 125 N.J. at 436-38, 593 A.2d 733 (discussing commentators' and courts' acknowledgement of areas in medical causation where scientific method cannot accommodate general acceptance standard). Resolved not to stifle innovation in the tort system in such areas, this Court crafted its own broadened approach to the demonstration of reliability for an expert's testimony. The trial court is the spigot that allows novel expert testimony in areas of evolving medical causation science, provided the proponent of the expert can demonstrate that the expert adheres to scientific norms in distinct ways that we have identified.

In Rubanick, we said that the court must ensure compliance with the requirement of "some expert consensus that the methodology and the underlying data are generally followed by experts in the field." Id. at 450, 593 A.2d 733. In Landrigan, we charged the trial court with the obligation to "distinguish scientifically sound reasoning from that of the self-validating expert." 127 N.J. at 414, 605 A.2d 1079. And, in Kemp, we reinforced the prohibition against allowing in "unsubstantiated personal beliefs." See 174 N.J. at 427, 809 A.2d 77.

The gatekeeping role requires care. The process of making such determinations is "complicated," and we knew it would be "difficult." Rubanick, 125 N.J. at 449, 593 A.2d 733. The gatekeeping role necessitates examination of a methodology espousing a new theory in medical cause-and-effect cases. See id. at 451-53, 593 A.2d 733 (distinguishing between methodology and credibility assessments). Properly exercised, the gatekeeping function prevents the jury's exposure to unsound science through the compelling voice of an expert. See State v. Cavallo, 88 N.J. 508, 518, 443 A.2d 1020 (1982) ("The danger of prejudice through introduction of unreliable expert evidence is clear. While juries would not always accord excessive weight to unreliable expert testimony, there is substantial danger that they would do so, precisely because the evidence is labeled 'scientific' and 'expert.' "). As explained in Landrigan, "the key to admission of the opinion is the validity of the expert's reasoning and methodology." 127 N.J. at 414, 605 A.2d 1079.

Difficult as it may be, the gatekeeping role must be rigorous. In resolving issues of reliability of an expert's methodology in a new and evolving area of medical causation, we cautioned that "the trial court should not substitute its judgment for that of the relevant scientific community. The court's function is to distinguish scientifically sound reasoning from that of the self-validating expert, who uses scientific terminology to present unsubstantiated personal beliefs." Ibid. We have repeatedly stressed that the gatekeeper's "critical determination is whether comparable experts accept the soundness of the methodology, including the reasonableness of relying on [the] type of underlying data and information." Rubanick, 125 N.J. at 451, 593 A.2d 733; see Landrigan, 127 N.J. at 417, 605 A.2d 1079.

That said, we can and should provide clearer direction to courts on how the gatekeeping function is properly performed. Recognizing proper gatekeeping when it is performed provides a discernible pathway for other courts to follow. We endeavor to do that with this matter. We add further clarification and assistance to trial courts, concerning performance of the gatekeeping role, when reviewing scientific expert testimony involving medical causation issues in civil matters, later in this opinion through our adoption of the Daubert factors for permissible use in such matters. See, infra, Section VII (op. at 396-400, 191 A.3d at 593-96).

VI.

In turning back to the matter before us to consider the trial court's exclusion of plaintiffs' experts' testimony, we begin by addressing the appropriate standard of review.

A.

A reviewing court must apply an abuse of discretion standard to a trial court's determination, after a full Rule 104 hearing, to exclude expert testimony on unreliability grounds. Hisenaj v. Kuehner, 194 N.J. 6, 12, 16, 942 A.2d 769 (2008). Here, the Appellate Division was persuaded to veer off that standard of review.

The Appellate Division stated that, although a trial court's decision to admit or exclude evidence is subject to an abuse of discretion standard, a reviewing court owes "somewhat less deference to a trial court's determination" regarding expert testimony. In re Accutane, 451 N.J. Super. at 197, 165 A.3d 832. However, as support for that proposition, the panel relied on State v. Torres, which was a criminal case that applied the Frye standard in determining the admissibility of expert witness testimony. 183 N.J. 554, 568, 874 A.2d 1084 (2005) (stating that "the field of inquiry must be generally accepted such that an expert's testimony would be sufficiently reliable" for expert testimony to be admissible); see also State v. J.R., 227 N.J. 393, 410, 152 A.3d 180 (2017) (relying on Torres for proposition that appellate court need not be as deferential on admissibility of expert scientific evidence).

That proposition has, to date, carried weight in the context of a court applying the general acceptance test in a criminal matter, see State v. Harvey, 151 N.J. 117, 167-70, 699 A.2d 596 (1997), but it is not appropriate in the context of a civil mass tort case, where the trial court has been entrusted with methodology-based review as the gatekeeper of expert testimony. Neither Rubanick, nor Landrigan, nor Kemp speaks to any such less-deferential standard, and this Court has continued to apply a pure abuse of discretion standard in civil matters concerning expert testimony. See, e.g., Townsend v. Pierre, 221 N.J. 36, 52-53, 110 A.3d 52 (2015) ("As this Court has noted, 'we apply [a] deferential approach to a trial court's decision to admit expert testimony, reviewing it against an abuse of discretion standard.' " (alteration in original) (quoting Pomerantz Paper Corp. v. New Cmty. Corp., 207 N.J. 344, 371-72, 25 A.3d 221 (2011) ) ).

We reaffirm that the abuse of discretion standard applies in the appellate review of a trial court's determination to admit or deny scientific expert testimony on the basis of unreliability in civil matters. We now apply that standard to the determination of the trial court in this matter.

B.

An expert must demonstrate the validity of his or her reasoning. Landrigan, 127 N.J. at 414, 605 A.2d 1079. Defendants argued, and presented experts who supported the argument, that plaintiffs' experts failed to meet our expert witness standard because they applied a contradictory and selective form of reasoning in which they used certain arguments to discredit the epidemiological evidence, only to abandon those arguments when relying on weaker evidence. The trial court essentially agreed with that assessment in conducting a properly performed robust analysis of the methodology advanced by plaintiffs' experts.

The trial court concluded that Dr. Kornbluth failed to persuade it of the soundness of his reasoning in support of his causation opinion. As his is the only causation opinion proffered, we focus on the court's rejection of Dr. Kornbluth's methodology used to support his opinion, particularly the court's reasoning and the bases on which the testimony was declared to be unsound. Although Dr. Kornbluth offered the only opinion on causation of plaintiffs' two experts, both experts offered interlocking expert testimony, and so our analysis of both is intertwined. The following assessments were made by the trial court in reaching its decision to exclude the testimony, and are supported by the record in this case.

Both Doctors Madigan and Kornbluth employed a methodology whereby they disregarded eight of nine epidemiological studies and relied on case reports and animal studies to support their opinion. It is clear that case reports are "at the bottom of the evidence hierarchy," Reference Manual at 724, and other courts have been skeptical of their value in proving causation, see, e.g., Allison v. McGhan Med. Corp., 184 F.3d 1300, 1316 (11th Cir. 1999) (holding that district court did not abuse its discretion by "discounting [expert's] reliance on case reports in the face of the overwhelming contrary epidemiological evidence presented"); Siharath v. Sandoz Pharms. Corp., 131 F.Supp.2d 1347, 1361 (N.D. Ga. 2001) (commenting, when rejecting experts' reliance on case reports in the absence of statistically significant epidemiological studies, that generally "[c]ase reports are not reliable scientific evidence of causation, because they simply describe[ ] reported phenomena without comparison to the rate at which the phenomena occur in the general population or in a defined control group; do not isolate and exclude potentially alternative causes; and do not investigate or explain the mechanism of causation." (second alteration in original) (quoting Casey v. Ohio Med. Prods., 877 F.Supp. 1380, 1385 (N.D. Cal. 1995))). And while animal studies may be helpful in "framing hypotheses," the Reference Manual intimates that such evidence is far less probative in the face of a "substantial body of epidemiologic evidence." Id. at 563, 564-65, 564 n.48. Such was the case here; initial animal studies may have suggested a possible causal connection between Accutane and Crohn's disease, but since that time a uniform body of epidemiological evidence has dispelled any such theory. See Siharath, 131 F.Supp.2d at 1366-68 (explaining why court viewed animal studies with suspicion in absence of confirmatory epidemiological studies and adding generally that extrapolation to humans is "not considered reliable in the absence of a credible scientific explanation of why such extrapolation is warranted"). Thus, we do not mean to suggest that animal studies and case reports can never be relied upon for forming an opinion on causation, but we find ample support for the trial court's determination that it was not proper to do so here in light of the uniform body of epidemiological evidence.

As for the one study on which they did rely -- Sivaraman -- plaintiffs' experts disagreed with the authors' ultimate conclusions and instead focused on unadjusted results. In so doing, plaintiffs' experts dismissed published studies examining thousands of subjects as underpowered and biased, in favor of relying on portions of a single unpublished study that examined 509 total subjects. In explaining their reasoning for disregarding the results of so many epidemiological studies that consistently failed to show an association between Accutane and Crohn's disease, plaintiffs' experts asserted that the prodrome for Crohn's disease is two years or more. They based that assessment on two smaller studies, Pimentel and Barratt, even though there were other larger studies, such as Chouraki, which had found a much shorter median prodromal period -- a median of less than about one year. Thus, despite their expressed concerns regarding study power -- which is based in part on the size of the study at issue -- in rejecting the epidemiological studies, plaintiffs' experts were willing to ignore any such concern when relying on the Pimentel and Barratt studies to form their opinion as to the median prodromal period. Moreover, as the trial court observed, plaintiffs' experts could have performed a meta-analysis to account for the power issue but refused.

The many contradictions in the experts' methodology were not lost on the trial court, which concluded that experts in the scientific community would not accept as consistent with scientific norms a methodology such as that used by plaintiffs' experts. In particular, the court found the methodology unsound because it relied on Sivaraman and Pimentel to the exclusion of other evidence. The trustworthiness of plaintiffs' experts' methodology was further undermined by internal inconsistencies, including the experts' refusal to examine the Rashtak and Fenerty studies on the ground that those studies did not report specific data for Crohn's disease while Dr. Kornbluth tethered his own causation opinion to case reports not specific to Crohn's disease and to studies performed on animals incapable of having any form of irritable bowel disease.

The trial court reasoned that the overall approach taken by Dr. Kornbluth -- rejecting the evidence from the epidemiological studies, which all found no causal association, and proffering his own alternative opinion that a causal association was present based on lesser forms of evidence -- was based on an unsound methodology. That conclusion comports with the decisions of many other courts that experts cannot selectively choose lower forms of evidence in the face of a large body of uniform epidemiological evidence. See, e.g., In re Lipitor, 892 F.3d 624, 634 (4th Cir. 2018) ("Result-driven analysis, or cherry-picking, undermines principles of the scientific method and is a quintessential example of applying methodologies (valid or otherwise) in an unreliable fashion."); Allison, 184 F.3d at 1316; Freeman v. Hoffman-La Roche, Inc., 300 Neb. 47, 911 N.W.2d 591, 596-98 (2018) (holding that trial court did not abuse its discretion in excluding expert testimony where expert employed inconsistent methodology and "cherry[-]pick[ed] studies from an overwhelmingly contrary body of literature"). The trial court found Dr. Kornbluth's approach to be demonstrably contrary to the hierarchy of evidence recognized in the Reference Manual and accepted generally in the scientific community. Indeed, even plaintiffs' experts acknowledged the scientific community's acceptance of a hierarchy of evidence, but they deviated from it. Moreover, as the court noted and found to be compelling, Dr. Kornbluth never submitted his ideas concerning biological mechanism or Accutane's relation to Crohn's disease for peer review or publication. In fact, he previously expressed concern, in an e-mail to a colleague admitted into the record in this matter, about any connection between Accutane and Crohn's disease after the Bernstein study, the first of the epidemiological studies, was published in 2009.

Finally, it bears noting that Dr. Kornbluth organized his testimony to support his personal view that a causal association existed between Accutane and Crohn's disease through use of the Hill guidelines. However, those guidelines are invoked only after an association between an agent and a particular disease has been determined to be present; their pointed purpose is to determine whether a detected association reflects true causality, not to create an association that has not already been detected through appropriate studies. See Reference Manual at 598-99 (explaining that Hill "guidelines are employed only after a study finds an association to determine whether that association reflects a true causal relationship"). Here, not one of the epidemiological studies found any statistically significant association between Accutane and Crohn's disease.

In sum, the trial court explained its reasons for concluding that plaintiffs' experts deviated from core scientific principles and strayed from their own claimed methodology in order to reach their conclusions. That the trial court deemed their testimony to be unreliable and excluded it from being presented is unsurprising. Ample evidence in the record supports that conclusion. Applying the abuse of discretion standard and the principles of Rubanick, Landrigan, and Kemp, we conclude that the trial court's determination is unassailable. The Appellate Division judgment, reversing the trial court's exclusion of the expert testimony, is reversed.

VII.

The divergent outcomes reached by the Appellate Division and the trial court in this matter provoked the debate among the parties and amici over whether our case law on the gatekeeping function is in need of clarification. Further, we are asked to consider adopting the Daubert standard, or to at least incorporate use of its factors, as a means to bring greater consistency to the gatekeeping function.

First, in respect of the gatekeeping role, we emphasize that we expect the trial court to assess both the methodology used by the expert to arrive at an opinion and the underlying data used in the formation of the opinion. That will ensure that the expert is adhering to norms accepted by fellow members of the pertinent scientific community. Methodology, in all its parts, is the focus of the reliability assessment, not outcome. See Clark v. Safety-Kleen Corp., 179 N.J. 318, 337, 845 A.2d 587 (2004) ("Rubanick changed the focus of the inquiry from the scientific community's acceptance of the substance of the opinion to its acceptance of the methodology and reasoning underlying it.").

It is not for a trial court to bless new "inspired" science theory; the goal is to permit the jury to hear reliable science to support the expert opinion. Cf. Rosen v. Ciba-Geigy Corp., 78 F.3d 316, 319 (7th Cir. 1996) ("[T]he courtroom is not the place for scientific guesswork, even of the inspired sort."). In this basic goal, there is not much light between our standard and that which has developed in the federal sphere under Daubert's initial instruction. Our law moved first, but in the same direction and with the same general goal as that noted by the Supreme Court in its Daubert trilogy. Both lines of initial decisions recognized the drawback of limiting expert testimony exclusively through the filter of general acceptance. For this Court, certain areas of law cried out for greater flexibility. For the federal courts, a broader reach was extended.

Importantly, both our law and the Daubert trilogy are aligned in their general approach to a methodology-based test for reliability. Both ask whether an expert's reasoning or methodology underlying the testimony is scientifically valid. Daubert, 509 U.S. at 594-95, 113 S.Ct. 2786 (explaining that inquiry into expert witness testimony requires examination of "scientific validity" of "the principles that underlie a proposed submission" and that court's focus "must be solely on principles and methodology"); Rubanick, 125 N.J. at 449, 593 A.2d 733 (holding that scientific theory of causation may be found sufficiently reliable where "it is based on a sound, adequately-founded methodology"). Moreover, both standards look to whether that reasoning or methodology properly can be applied to facts in issue. Daubert, 509 U.S. at 591, 113 S.Ct. 2786 (explaining that Rule 702 requires that there be proper "fit" between expert testimony and facts of case and that expert testimony must be sufficiently tied to facts of case in order to aid jury in resolving matters at issue (citing United States v. Downing, 753 F.2d 1224, 1242 (3d Cir. 1985))); Rubanick, 125 N.J. at 449, 593 A.2d 733 ("The expert must possess a demonstrated professional capability to assess the scientific significance of the underlying data and information, to apply the scientific methodology, and to explain the bases for the opinion reached." (emphasis added)).

Importantly, Daubert identified a non-exhaustive list of factors for courts to consider using, if helpful, when it expanded on its test for assessing the reliability of scientific expert testimony. See 509 U.S. at 593-95, 113 S.Ct. 2786. Distilled, the general factors identified as perhaps pertinent for consideration, but not dispositive or exhaustive, are:

1) Whether the scientific theory can be, or at any time has been, tested;
2) Whether the scientific theory has been subjected to peer review and publication, noting that publication is one form of peer review but is not a "sine qua non";
3) Whether there is any known or potential rate of error and whether there exist any standards for maintaining or controlling the technique's operation; and
4) Whether there does exist a general acceptance in the scientific community about the scientific theory.

That last consideration -- general acceptance in the scientific community -- continues to have a bearing because, minimally, it permits the identification of a relevant scientific community and facilitates an express determination of a particular degree of acceptance within that community, or contrarily permits a technique with minimal support to be viewed with skepticism. See id. at 594, 113 S.Ct. 2786.

We are persuaded that the factors identified originally in Daubert should be incorporated for use by our courts. The factors dovetail with the overall goals of our evidential standard and would provide a helpful -- but not necessary or definitive -- guide for our courts to consider when performing their gatekeeper role concerning the admission of expert testimony. Several are aimed at achieving the same examination for peer acceptance of a methodology (but not the outcome reached from that methodology) described in our earlier opinions. See Landrigan, 127 N.J. at 417, 605 A.2d 1079; Rubanick, 125 N.J. at 449-50, 593 A.2d 733.

In adopting use of the Daubert factors, we stop short of declaring ourselves a "Daubert jurisdiction." Like several other states, we find the factors useful, but hesitate to embrace the full body of Daubert case law as applied by state and federal courts. See, e.g., People v. Shreck, 22 P.3d 68, 70 (Colo. 2001) (en banc) (abandoning Frye test and holding that trial court "may consider" Daubert factors); McDaniel v. CSX Transp., 955 S.W.2d 257, 265 (Tenn. 1997) (finding factors useful in determining reliability but not expressly adopting Daubert).

First, we have already broadened our approach to testing for the reliability of expert testimony for certain areas in civil law, see Kemp, 174 N.J. at 430, 809 A.2d 77; but, to date, we retain the general acceptance test for reliability in criminal matters, see Harvey, 151 N.J. at 167-70, 699 A.2d 596. Second, there is no monolithic body of case law uniformly or even consistently applying Daubert, as others have noted. See, e.g., Motorola, 147 A.3d at 757. We hesitate to sweep in adherence to the various approaches taken among the circuits and state jurisdictions when applying the Daubert factors. Thus, we do not adopt a "standard" that we cannot fully discern in its application at this time. While the factors are helpful, and while individual cases may be persuasive in appropriate settings, we cannot ignore that there are discordant views about the gatekeeping role among Daubert jurisdictions. See ibid.; see generally David E. Bernstein & Eric G. Lasker, Defending Daubert: It's Time to Amend Federal Rule of Evidence 702, 57 Wm. & Mary L. Rev. 1, 26-36 (2015).

Our view of proper gatekeeping in a methodology-based approach to reliability for expert scientific testimony requires the proponent to demonstrate that the expert applies his or her scientifically recognized methodology in the way that others in the field practice the methodology. When a proponent does not demonstrate the soundness of a methodology, both in terms of its approach to reasoning and to its use of data, from the perspective of others within the relevant scientific community, the gatekeeper should exclude the proposed expert testimony on the basis that it is unreliable. See, e.g., In re Lipitor, 892 F.3d at 634; Lust v. Merrell Dow Pharms., Inc., 89 F.3d 594, 598 (9th Cir. 1996) (noting that courts should be wary that expert has not faithfully applied methodology where expert's conclusions are "anomalous"); In re Rezulin Prods. Liab. Litig., 369 F.Supp.2d 398, 425 (S.D.N.Y. 2005) (explaining that courts typically exclude testimony from experts who selectively choose from scientific landscape); Freeman, 911 N.W.2d at 596-98.

Importantly, that approach -- namely, to determine whether the scientific community would accept the methodology employed by plaintiffs' experts and would use the underlying facts and data as did plaintiffs' experts -- was employed by the trial court here. We approve of the court's methodological analysis. There was no encroachment on the factfinding function here. The trial court did the type of rigorous gatekeeping that is necessary when faced with a novel theory of causation, particularly one, as here, that flies in the face of consistent findings of no causal association as determined by higher levels of scientific proof.

For all the reasons expressed herein, we conclude that the trial court's exclusion of plaintiffs' experts' testimony was well-supported and well-reasoned. There was no abuse of discretion by the trial court in its evidential ruling.

VIII.

We reverse the judgment of the Appellate Division, which had reversed the trial court's exclusion of plaintiffs' experts' testimony.

CHIEF JUSTICE RABNER and JUSTICES ALBIN, FERNANDEZ-VINA, SOLOMON, and TIMPONE join in JUSTICE LaVECCHIA's opinion. JUSTICE PATTERSON did not participate. 
      
      Frye v. United States, 293 F. 1013 (D.C. Cir. 1923).
     
      
      Daubert v. Merrell Dow Pharms., Inc., 509 U.S. 579, 113 S.Ct. 2786, 125 L.Ed.2d 469 (1993).
     
      
      As noted in the Appellate Division's opinion in this matter, this case is one of many mass tort cases spanning over a decade relating to Accutane. In re Accutane Litig., 451 N.J. Super. 153, 164-65 & n.6, 165 A.3d 832 (App. Div. 2017).
     
      
      Three sections of the Reference Manual are relevant to this appeal: the Reference Guide on Statistics, by David H. Kaye and David A. Freedman; the Reference Guide on Epidemiology by Michael D. Green et al.; and the Reference Guide on Medical Testimony by John B. Wong et al. For ease of reference, we refer to the Reference Manual as a single document.
     
      
      As the Reference Manual explains, a randomized trial is performed by randomly assigning study subjects into one of two groups: one group is exposed to the agent in question and the other group is not. Id. at 555. The group not exposed to the agent is given a placebo, an inactive ingredient. Ibid. The feasibility of such studies is often limited due to the potentially harmful side effects associated with a particular agent. Ibid.
     
      
      The guidelines initially were proposed by the Surgeon General in 1964 and were later expanded upon by Sir Austin Bradford Hill in 1965. Reference Manual at 600.
     
      
      Charles N. Bernstein et al., Isotretinoin is Not Associated with Inflammatory Bowel Disease: A Population-Based Case-Control Study, 104 Am. J. Gastroenterol. 2774 (2009).
     
      
      Seth D. Crockett et al., Isotretinoin Use and the Risk of Inflammatory Bowel Disease: A Case-Control Study, 105 Am. J. Gastroenterol. 1986 (2010).
     
      
      Mahyar Etminan et al., Isotretinoin and Risk for Inflammatory Bowel Disease, 149 JAMA Dermatol. 216 (2013).
     
      
      A meta-analysis is a form of epidemiological study whereby the study authors pool separate studies together and then interpret the results. See Reference Manual at 289.
     
      
      Raed O. Alhusayen et al., Isotretinoin Use and the Risk of Inflammatory Bowel Disease: A Population-Based Cohort Study, 133 J. Investigative Dermatol. 907 (2013).
     
      
      Sarah Fenerty et al., Impact of Acne Treatment on Inflammatory Bowel Disease, 68 J. Am. Acad. Dermatol. AB5 (2013).
     
      
      Shadi Rashtak et al., Isotretinoin Exposure and Risk of Inflammatory Bowel Disease, 150 JAMA Dermatol. 1322 (2014).
     
      
      Antoine Racine et al., Isotretinoin and Risk of Inflammatory Bowel Disease: A French Nationwide Study, 109 Am. J. Gastroenterol. 563 (2014).
     
      
      Mark Pimentel et al., Identification of a Prodromal Period in Crohn's Disease but Not Ulcerative Colitis, 95 Am. J. Gastroenterol. 3458 (2000).
     
      
      S.M. Barratt et al., Prodromal Irritable Bowel Syndrome May Be Responsible for Delays in Diagnosis in Patients Presenting with Unrecognized Crohn's Disease and Celiac Disease, but Not Ulcerative Colitis, 56 Dig. Dis. Sci. 3270 (2011).
     
      
      As plaintiffs' experts explained, a point estimate serves as a way to calculate a single value for a sample of data and is the researchers' "best guess" as to the level of risk of a specific health effect from the substance being studied. A point estimate of 1.0 is indicative of no effect; a point estimate above 1.0 is indicative of increased risk; and a point estimate below 1.0 is indicative of decreased risk. For more information, see generally Reference Manual at 292. We note that the experts in this case do not distinguish between the term "point estimate" and the related concepts of "relative risk" and "odds ratio." For a discussion of those concepts, see generally id. at 566-69.
     
      
      As plaintiffs' experts explained, statistical significance is measured through use of a confidence interval, which reflects a range of possible values calculated from the results of a particular study. A confidence interval that ranges from below 1.0 to above 1.0 is considered not to be statistically significant. Thus, even where a study's point estimate is indicative of either a decreased or increased risk, that result would not be considered statistically significant if the confidence interval ranges both above and below 1.0. On the other hand, a study will be considered statistically significant where the confidence interval is entirely above or entirely below 1.0. For more information, see generally Reference Manual at 574-83, 621.
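The reading rules described in the two preceding notes (a point estimate compared against 1.0, and a confidence interval that either spans or excludes 1.0) can be illustrated with a minimal sketch. The function names and the numeric values below are hypothetical and are not drawn from any study in the record; treating an interval that merely touches 1.0 as not significant is likewise an assumption made for the illustration.

```python
def describe_point_estimate(point_estimate: float) -> str:
    """Read a point estimate against the 'no effect' value of 1.0."""
    if point_estimate > 1.0:
        return "increased risk"
    if point_estimate < 1.0:
        return "decreased risk"
    return "no effect"


def is_statistically_significant(ci_lower: float, ci_upper: float) -> bool:
    """True only when the whole confidence interval lies above or below 1.0.

    Assumption: an interval whose bound equals exactly 1.0 is treated as
    not significant; the notes above do not address that edge case.
    """
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    return ci_lower > 1.0 or ci_upper < 1.0


# Hypothetical (point estimate, lower bound, upper bound) triples.
examples = [(1.4, 0.8, 2.5), (1.6, 1.1, 2.3), (0.7, 0.5, 0.9)]
for point_estimate, lower, upper in examples:
    direction = describe_point_estimate(point_estimate)
    significant = is_statistically_significant(lower, upper)
    print(f"{point_estimate} [{lower}, {upper}]: {direction}, significant={significant}")
```

The first hypothetical triple shows the situation the note describes: a point estimate above 1.0 that is nonetheless not statistically significant because its interval extends below 1.0.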
     
      
      To summarize his detailed explanation, Dr. Kornbluth asserts that retinoic acid, a breakdown product of Accutane, marks inflammatory cells known as "T cells" with a compound known as "alpha 4 beta 7." That binding process allows the inflammatory T-cells to then travel through the digestive tract and bind to another receptor known as "MadCAM." The process of inflammatory T-cells traveling through the digestive tract and binding to the other receptors on the intestinal wall then creates the inflammation that results in Crohn's disease. He also explained that two drugs currently used in treating Crohn's disease -- vedolizumab and natalizumab -- are believed to work by blocking alpha 4 beta 7 from binding to the T-cells.
     
      
      As Dr. Madigan explained, a power analysis examines for the risk that a study's outcome was a "false negative."
     
      
      Deepa Reddy et al., Possible Association Between Isotretinoin and Inflammatory Bowel Disease, 101 Am. J. Gastroenterol. 1569 (2006).
     
      
      V. Chouraki et al., The Changing Pattern of Crohn's Disease Incidence in Northern France: a Continuing Increase in the 10-to 19-Year-Old Age Bracket (1988-2007), 33 Aliment. Pharmacol. Ther. 1133 (2011).
     
      
      The trial court's determination was based solely on plaintiffs' experts' methodology. The experts' credentials were not in issue at any point.
     
      
      As Frye garnered considerable criticism through the years, the United States Supreme Court ultimately resolved a split among the circuit courts and held that Frye was superseded by the adoption of the Federal Rules of Evidence. Daubert, 509 U.S. at 585-87, 113 S.Ct. 2786. We discuss Daubert in greater detail later in this opinion.
     
      
      The Court also acknowledged that other evidence rules pertain in the analysis in addition to Rule 702, including Rules 703 and 403. Id. at 595, 113 S.Ct. 2786.
     
      
      In 1992, this Court adopted N.J.R.E. 702 to replace Evidence Rule 56(2) and tracked the language of the then-existing version of Federal Rule of Evidence 702. The Official Comment to N.J.R.E. 702 notes that our Rule followed the then-existing federal rule verbatim, with a minor language change.
     
      
      The trial court explained that the Sivaraman study appeared only as an abstract in the American Journal of Gastroenterology, and a written report detailing the study's findings has never been published. As defendants noted, the study remains unpublished today, and it has not been cited in any published opinion other than the present case.
     