
    2004-NMSC-027
    96 P.3d 291
    Kevin LEE, William Vance Langley, Teri Bogey, Teodoro Jose Gallegos, and Erlinda Saiz, Petitioners, v. Hon. Lourdes MARTINEZ, Third Judicial District Court, Hon. Jay Forbes, Fifth Judicial District Court, Hon. Charles Currier, Fifth Judicial District Court, Hon. Timothy L. Garcia, First Judicial District Court, and Hon. W. John Brennan, Second Judicial District Court, Respondents.
    No. 27,915.
    Supreme Court of New Mexico.
    July 14, 2004.
    
      John Bigelow, Chief Public Defender, Jeff Rein, Freedman, Boyd, Daniels, Hollander, Goldberg & Cline, P.A., Charles W. Daniels, Albuquerque, for Petitioners.
    Patricia A. Madrid, Attorney General, Margaret McLean, Assistant Attorney General, Santa Fe, for Respondents.
    University of New Mexico School of Law, Michael B. Browde, Albuquerque, for Amicus Curiae New Mexico Trial Lawyers Association.
    Vaughan & DeMuro, Gordon L. Vaughan, Colorado Springs, CO, William E. Parnall, Albuquerque, for Amicus Curiae American Polygraph Association and American Association of Police Polygraphists.
   OPINION

MINZNER, Justice.

{1} Petitioners are defendants in several pending criminal cases who are seeking to have their polygraph examination results admitted into evidence under Rule 11-707(C) NMRA 2004, which states that “the opinion of a polygraph examiner may in the discretion of the trial judge be admitted as evidence as to the truthfulness of any person called as a witness,” provided certain conditions are met. In each case the State has opposed the admission of such polygraph evidence on the ground that it fails to satisfy the standard for the admissibility of expert testimony set forth in Rule 11-702 NMRA 2004. On February 10, 2003, Petitioners filed a Petition for Writ of Superintending Control asking this Court to order the district courts to comply with Rule 11-707, rather than conducting a separate Rule 11-702 hearing in each case.

{2} On April 14, 2003, we granted Petitioners’ request for a writ pursuant to Rule 12-504 NMRA 2004 and Article VI, Section 3 of the New Mexico Constitution. In our order, we remanded the cases to the Honorable Richard J. Knowles of the Second Judicial District “for the limited purpose of conducting an evidentiary hearing as to the scientific reliability of polygraph evidence under State v. Alberico, 116 N.M. 156, 861 P.2d 192 (1993), State v. Anderson, 118 N.M. 284, 881 P.2d 29 (1994), and State v. Torres, 1999-NMSC-010, 127 N.M. 20, 976 P.2d 20.” The district court held a seven-day evidentiary hearing in order to determine whether polygraph evidence should be admissible.

{3} On August 25, 2003, the district court filed its Findings of Fact and Conclusions of Law. In addition to its legal conclusions, the district court’s order contained a thorough description of the polygraph examination and a comprehensive review of how other jurisdictions have treated polygraph evidence. The district court’s Findings of Fact and Conclusions of Law are attached as an appendix. First, the district court concluded polygraph results are not sufficiently reliable to satisfy Rule 11-702. Second, the district court concluded that “the limited probative value [of] polygraph test results is substantially outweighed by the danger of confusion of the issues, undue delay, and waste of time” rendering such results inadmissible under Rule 11-403 NMRA 2004. Third, the district court cited authority for the proposition that polygraph testimony is inadmissible under Rule 11-608(B) NMRA 2004, which generally provides that “[s]pecific instances of the conduct of a witness, for the purpose of attacking or supporting the witness’s credibility ... may not be proved by extrinsic evidence.”

{4} We now must consider whether to repeal our Rule 11-707 and hold that polygraph results are per se excluded. For the reasons that follow in this opinion, we do not repeal Rule 11-707. Instead, we hold that polygraph examination results are sufficiently reliable to be admitted under Rule 11-702, provided the expert is qualified and the examination was conducted in accordance with Rule 11-707. Therefore, we exercise our power of superintending control to order the district courts in the pending cases to comply with Rule 11-707 in determining whether to admit polygraph examination results. The proponents of such polygraph evidence are not required to independently establish the reliability of the examiner’s testimony in a Daubert/Alberico hearing.

{5} We do not address the admissibility of the polygraph results in the pending cases under Rule 11-403 because it would be inappropriate for this Court to categorically exclude any type of evidence under that rule. See Ohlson v. Kent Nowlin Const. Co., 99 N.M. 539, 542, 660 P.2d 1021, 1024 (Ct.App. 1983) (“There is, and can be, no fixed rule delineating relevant and irrelevant evidence. The problem must be decided on a case-by-case basis.”). Furthermore, Rule 11-707(C) specifically provides that the admissibility of polygraph results is subject to “the discretion of the trial judge.” We believe that the district court in its discretion may properly exclude polygraph results when the probative value of such results “is substantially outweighed by the danger of unfair prejudice, confusion of the issues or misleading the jury, or by considerations of undue delay, waste of time or needless presentation of cumulative evidence.” Rule 11-403. However, it would be an abuse of discretion for the district court to apply Rule 11-403 to exclude polygraph results that were conducted in accordance with Rule 11-707 if the district court’s reasons for excluding the evidence are grounded in a general disbelief in the reliability of polygraph results or a general hostility toward polygraph evidence.

{6} We also decline to address the applicability of Rule 11-608(B) because the issue was not raised in the Petition for a Writ of Superintending Control and was not extensively briefed by the parties. However, we note that Rule 11-608(B) deals with character evidence. Rule 11-707(C) states that “the opinion of a polygraph examiner may ... be admitted as evidence as to the truthfulness of any person called as a witness.” (Emphasis added.) If, as Rule 11-707(C) seems to allow, polygraph results are offered as character evidence, then Rule 11-707 may very well act as an exception to Rule 11-608(B). Furthermore, polygraph results are not necessarily character evidence; the evidence may be offered as evidence of the examinee’s lack of consciousness of guilt, which would be admissible under Rule 11-404(B) NMRA 2004. See State v. Martinez, 1999-NMSC-018, ¶ 29, 127 N.M. 207, 979 P.2d 718 (“[C]onsciousness of guilt, like intent or motive, constitutes a permissible use of other acts or wrongs under Rule 11-404(B)”). At any rate, we need not decide the issue in this opinion.

I. THE POLYGRAPH EXAMINATION

{7} The National Academy of Sciences (“NAS”), a private, non-profit society of distinguished scientists and engineers that advises the federal government on scientific and technical matters, recently conducted a review of the validity of polygraph testing. The published report of the NAS provides a detailed description of the various polygraph testing techniques, sets forth the basic scientific theories underlying the polygraph examination, and objectively reviews the scientific literature on the reliability of polygraph examinations. See National Research Council of the National Academies, The Polygraph and Lie Detection (2003), available at http://www.nap.edu/openbook/0309081369/html [hereinafter “NAS Report”]. The NAS Report contributed greatly to our understanding of the underlying science of the polygraph examination and was immensely helpful to our resolution of the issues in this case. In this section, we rely heavily on the NAS Report in describing the modern polygraph examination.

{8} The polygraph instrument records “physiological responses that are believed to be stronger during acts of deception than at other times.” Id. at 13. These physiological responses include cardiovascular activity, electrodermal activity (electrical conductance at the skin surface), and respiratory activity. See id. at 286-89 (describing in detail the physiological processes measured by the polygraph). In general, a polygraph examination consists of “a series of yes/no questions to which the examinee responds while connected to sensors that transmit data on these physiological phenomena by wire to the instrument, which uses analog or digital technology to record the data.” Id. at 13. “[T]he record of physiological responses during the polygraph test is known as the polygraph chart.” Id. The polygraph examination is based on the theory that “a deceptive response to a question causes a reaction-such as fear of detection or psychological arousal — that changes respiration rate, heart rate, blood pressure, or skin conductance relative to what they were before the question was asked.” Id.

{9} Three different polygraph questioning techniques have been developed. First, in the “relevant/irrelevant” technique, the examinee is asked two different types of questions — “the relevant questions are typically very specific and concern an event under investigation”; whereas, “[t]he irrelevant questions may be completely unrelated to the event and may offer little temptation to deceive.” Id. at 14. A deceptive person is expected to have a stronger physiological response to the relevant questions than to the irrelevant questions. Id. Second, in the “control question technique” or “comparison question technique,” instead of coupling the relevant questions with irrelevant questions, the irrelevant questions are replaced with control questions “intended to generate physiological reactions even in nondeceptive examinees.” Id. An example of a control question might be, “Have you ever lied to a friend?” Truthful examinees are expected to experience stronger physiological responses to the control questions; whereas, deceptive examinees are expected to experience stronger physiological responses to the relevant questions. See id. at 14-15. Third, in the “guilty knowledge polygraph test,” the examinee is asked a number of “questions about details of an event under investigation that are known only to investigators and those with direct knowledge of the event.” Id. at 15. Examinees are expected to experience the greatest physiological responses to those questions that accurately describe the event. Id.

{10} In this opinion, we address only polygraph examinations conducted using the control question technique because it appears that in each pending case below that technique was used. The control question technique is the most widely used questioning technique for evidentiary polygraph examinations. The relevant/irrelevant technique cannot be used because those examinations are not numerically scored. See Rule 11-707(C)(2) (providing that the opinion of a polygraph examiner can only be admitted if “the polygraph examination was quantitatively scored”). The guilty knowledge test is generally used in investigations and was not used in any of the cases pending below.

II. STANDARD OF REVIEW.

{11} As a preliminary matter, we must determine the level of deference which we will afford the district court’s findings of fact and conclusions of law. In general, “[t]he rule in this State has consistently been that the admission of expert testimony or other scientific evidence is peculiarly within the sound discretion of the trial court and will not be reversed absent a showing of abuse of that discretion.” Alberico, 116 N.M. at 169, 861 P.2d at 205. However, the procedural posture in which this case arose demands a heightened standard of review. Rather than issuing a ruling regarding the admissibility of expert testimony during the course of an individual trial, Judge Knowles was ordered by this Court to conduct a special evidentiary hearing. He properly viewed his role as that of a “special master.” Rule 1-053 NMRA 2004 allows for the appointment of a special master by any court in which an action is pending. As a special master, Judge Knowles had the power to require the production of certain evidence, rule upon the admissibility of evidence, and allow for the examination of witnesses. See Rule 1-053(C). We ordered Judge Knowles to file findings of fact and conclusions of law in this Court.

{12} Under Rule 1-053, the standard of review for findings of fact differs from that for conclusions of law. Lozano v. GTE Lenkurt, Inc., 1996-NMCA-074, ¶ 16, 122 N.M. 103, 920 P.2d 1057. “[T]he court shall accept the master’s findings of fact unless [they are] clearly erroneous.” Rule 1-053(E)(2). A master’s conclusions of law are reviewed de novo. Lozano, 1996-NMCA-074, ¶ 18, 122 N.M. 103, 920 P.2d 1057; see also Rule 1-053(E)(2) (“The court after hearing may adopt the [master’s] report or may modify it or may reject it in whole or in part or may receive further evidence or may recommit it with instructions.”). Therefore, it is clear that in reviewing Judge Knowles’ conclusions of law, “we exercise our own independent judgment without assigning special weight to [his] decision.” Martinez v. Friede, 2004-NMSC-006, ¶ 10, 135 N.M. 171, 86 P.3d 596.

{13} It is less clear what standard of review we should apply to Judge Knowles’ findings of fact. While Rule 1-053(E)(2) appears to require us to adopt an extremely deferential standard of review, Petitioners argue that the findings of fact should also be reviewed de novo because the findings are legislative facts, not adjudicative facts. “Legislative facts are those which help the tribunal to determine the content of law and policy and to exercise its judgment or discretion in determining what course of action to take.” Trujillo v. City of Albuquerque, 110 N.M. 621, 635, 798 P.2d 571, 585 (1990) (Montgomery, J., concurring in part, dissenting in part) (quoting Kenneth Culp Davis, Judicial Notice, 55 Colum. L.Rev. 945, 952 (1955)). Unlike adjudicative facts, legislative facts do not concern individual parties, such as who did what, when, where, and how. Id. Since Judge Knowles’ findings of fact were formulated to help this Court develop its policy regarding the admissibility of polygraph examination results, we conclude his findings are legislative in nature. As such, we will also review Judge Knowles’ findings of fact de novo.

III. RULE 11-702.

{14} The State forcefully argues that our Rule 11-707, which governs the admissibility of polygraph examination results, should be repealed in light of our analysis for the admissibility of expert testimony set forth in Alberico and its progeny. Neither this Court nor the Court of Appeals has applied the Daubert/Alberico analysis for the admissibility of expert testimony to polygraph results. We could hold Rule 11-707 acts as an exception to Rule 11-702, thus obviating the need for such expert testimony to satisfy Rule 11-702. Cf. Banks v. IMC Kalium Carlsbad Potash Co., 2003-NMSC-026, 134 N.M. 421, 77 P.3d 1014 (holding that the Daubert/Alberico analysis does not apply to the testimony of a health care provider regarding causation in administrative proceeding under the Workers’ Compensation Act). However, we refuse to do so without conducting a Daubert/Alberico analysis first:

Since a polygraph examiner renders an opinion about a subject that involves a scientific device that is purported to measure and record a number of involuntary body responses to the stress produced by knowing deception, [Rule 11-702] clearly has some bearing on the admissibility of polygraph evidence.

Leo M. Romero, The Admissibility of Scientific Evidence under the New Mexico and Federal Rules of Evidence, 6 N.M. L.Rev. 187, 197 (1976); cf. Torres, 1999-NMSC-010, ¶ 31, 127 N.M. 20, 976 P.2d 20 (holding that the horizontal gaze nystagmus field test for sobriety is scientific evidence that must satisfy Rule 11-702 because the test “is based on principles of medicine and science not readily understandable to the jury”) (quoting State v. Meador, 674 So.2d 826, 834 (Fla.Dist.Ct.App.1996)).

{15} The purpose of Rule 11-702 is “to assist the trier of fact to understand the evidence and to determine the issues of fact.” Madrid v. Univ. of California, 105 N.M. 715, 718, 737 P.2d 74, 77 (1987). Scientific evidence can only assist the trier of fact if it is “grounded in valid, objective science” and is “reliable enough to prove what it purports to prove.” Alberico, 116 N.M. at 168, 861 P.2d at 204. If we held that polygraph evidence did not have to satisfy Rule 11-702, we would in effect be conceding that polygraph evidence is either not grounded in science or is not sufficiently reliable to assist the trier of fact. Such a holding would be inappropriate and unnecessary. Therefore, we take this opportunity to subject polygraph evidence to a proper Daubert/Alberico analysis in order to inform our determination on the continued vitality of Rule 11-707.

A. The Daubert/Alberico Analysis.

{16} In Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579, 588, 113 S.Ct. 2786, 125 L.Ed.2d 469 (1993), the United States Supreme Court rejected the rigid “general acceptance” test for the admissibility of expert testimony first articulated in Frye v. United States, 293 F. 1013, 1014 (D.C.Cir.1923). The Court held that application of a rigid “general acceptance” test “would be at odds with the ‘liberal thrust’ of the Federal Rules and their ‘general approach of relaxing the traditional barriers to “opinion” testimony.’” Daubert, 509 U.S. at 588, 113 S.Ct. 2786 (quoting Beech Aircraft Corp. v. Rainey, 488 U.S. 153, 169, 109 S.Ct. 439, 102 L.Ed.2d 445 (1988)). This liberal approach to the admission of evidence is consistent with the intent of the drafters of the Federal Rules of Evidence. As one notable commentator has recognized:

Universality of education and the almost instantaneous dispersal of information through modern technology have created a citizenry with a remarkable and historically unique breadth of knowledge, perception, and sophistication. These mature men and women should be treated with the respect they deserve. Excluding information on the ground that jurors are too ignorant or emotional to evaluate it properly may have been appropriate in England at a time when a rigid class society created a yawning gap between royal judges and commoner jurors, but it is inconsistent with the realities of our modern American informed society and the responsibilities of independent thought in a working democracy.

1 Jack B. Weinstein & Margaret A. Berger, Weinstein’s Federal Evidence xix (2d ed.2003); see also State v. Mann, 2002-NMSC-001, ¶ 27, 131 N.M. 459, 39 P.3d 124 (“Jurors are generally knowledgeable in many areas, and they are entitled to use their common or acquired sense in arriving at a verdict ....”) (quoted authority omitted). Given the capabilities of jurors and the liberal thrust of the rules of evidence, we believe any doubt regarding the admissibility of scientific evidence should be resolved in favor of admission, rather than exclusion. See Brown v. Gen. Ins. Co. of Am., 70 N.M. 46, 54, 369 P.2d 968, 973 (1962).

{17} Rule 11-702 governs the admissibility of scientific evidence:

If scientific, technical or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training or education may testify thereto in the form of an opinion or otherwise.

In Alberico, 116 N.M. at 166, 861 P.2d at 202, we discerned three prerequisites in Rule 11-702 for the admission of expert opinion testimony. First, the expert must be qualified. Id. Second, the testimony must “assist the trier of fact.” Id. Third, the expert may only testify as to “scientific, technical or other specialized knowledge.” Id. The first two prerequisites are not at issue in this opinion. In each individual case, the district court must determine whether the proffered expert is qualified under Rule 11-707 to give expert testimony on polygraph results. Additionally, there can be little doubt that polygraph evidence indicating that a defendant or witness is telling the truth or lying about a specific incident at issue would be helpful to the jury. Cf. Anderson, 118 N.M. at 296-97, 881 P.2d at 41-42 (concluding that DNA evidence linking the defendant to the crime scene was helpful to the jury). Thus, the focus of this opinion is on the reliability of the control question polygraph examination.

{18} “[U]nder the Rules the trial judge must ensure that any and all scientific testimony or evidence admitted is not only relevant, but reliable.” Id. at 291, 881 P.2d at 36 (quoting Daubert, 509 U.S. at 589, 113 S.Ct. 2786); accord Torres, 1999-NMSC-010, ¶ 26, 127 N.M. 20, 976 P.2d 20 (“[E]videntiary reliability is the hallmark for the admissibility of scientific knowledge.”). Thus, “the trial court must determine whether the scientific technique is based upon well-recognized scientific principle and whether it is capable of supporting opinions based upon reasonable probability rather than conjecture.” Alberico, 116 N.M. at 167, 861 P.2d at 203. In making this determination, we consider: “(1) whether a theory or technique can be (and has been) tested; (2) whether the theory or technique has been subjected to peer review and publication; (3) the known potential rate of error in using a particular scientific technique and the existence and maintenance of standards controlling the technique’s operation; and (4) whether the theory or technique has been generally accepted in the particular scientific field.” Anderson, 118 N.M. at 291, 881 P.2d at 36 (quotation marks and quoted authority omitted). We apply these factors to the control question polygraph examination to determine whether that evidence is sufficiently reliable to satisfy Rule 11-702.

B. Application of the Alberico Factors.

i. Testability.

{19} We first address whether the polygraph examination can be tested, and if so, whether it has been tested. Id.; see also Daubert, 509 U.S. at 593, 113 S.Ct. 2786. “Scientific methodology today is based on generating hypotheses and testing them to see if they can be falsified; indeed, this methodology is what distinguishes science from other fields of human inquiry.” Daubert, 509 U.S. at 593, 113 S.Ct. 2786 (quoted authority omitted). Applying this factor to polygraph examinations, the district court concluded:

Polygraph test results and the conclusions derived from them are not based upon an overarching theory. To the extent it is merely argued that there is a hypothesis that the test reliably detects deception, that hypothesis has not been subjected to field research. The existing laboratory research, given the problems described [in the Findings of Fact], is woefully inadequate to support admissibility in court in real life contexts.

In reviewing the district court’s conclusion, we must determine whether a testable hypothesis has been generated for the control question polygraph, and if so, whether that hypothesis has in fact been tested.

{20} The hypothesis of the polygraph examination was discussed thoroughly in the NAS Report, which notes that a well-supported theory can provide confidence the polygraph can be accurate when used in novel situations and with different examinees. NAS Report, supra, at 66. Also, a theory is essential to providing confidence the polygraph will work well despite efforts by examinees to “beat the polygraph” through the use of various countermeasures. Id. Finally, “[a] solid theoretical and scientific base is also valuable for improving [the polygraph] test because it can identify the most serious threats to the test’s validity and the kinds of experiments that need to be conducted to assess such threats.” Id. at 69.

{21} The NAS Report notes that “[a]ccording to contemporary theories of polygraph questioning, individuals who are being deceptive or truthful in responding to relevant questions show different patterns of physiological response when their reactions to relevant and comparison questions are compared.” Id. at 70. The specific theory of the control question technique is that an innocent person will show a greater physiological response to the control questions; but, a guilty person will react more strongly to the relevant questions. Id. The NAS Report states that in order to have a well-supported theory, “it is ... necessary to identify the relevant psychological states and to understand how those states are linked to characteristics of the test questions intended to create the states and to the physiological responses the states are said to produce.” Id. at 71-72. The current polygraph research, though, has focused almost exclusively on the applicability of the polygraph at the expense of developing the underlying science. Id. at 92. Specifically, “[t]here has been no systematic effort to identify the best potential physiological indicators on theoretical grounds or to update theory on the basis of emerging knowledge in psychology or physiology.” Id.

{22} Petitioners agree there is no scientifically testable hypothesis explaining all the psychophysiological variables occurring in the control question polygraph. However, Petitioners argue such an overarching theory is not necessary for polygraph results to be deemed admissible under Rule 11-702. We agree. The State’s primary witness admitted at the evidentiary hearing held below that people experience “emotional turmoil” when they are telling a lie, and these emotions can be detected by the polygraph machine. Also, despite its criticism of the current research on the polygraph, the NAS Report nonetheless concludes that “[b]asic scientific knowledge of psychophysiology offers support for expecting polygraph testing to have some diagnostic value, at least among naive examinees.” Id. at 101. The NAS Report further concludes that “[a]lthough the basic science indicates that polygraph testing has inherent limits regarding its potential accuracy, it is possible for a test with such limits to attain sufficient accuracy to be useful in practical situations.” Id. at 102.

{23} As we noted in Anderson, “refutability” is the key criterion when analyzing the scientific theory or hypothesis underlying expert testimony. 118 N.M. at 297, 881 P.2d at 42. Under the facts of that case, in which we examined the admissibility of certain DNA evidence under Rule 11-702, we stated:

Defendants vociferously dispute the accuracy of the match results and the adequacy of the testing done, and in refutation have presented evidence about deficiencies in both the results and the testing of the results. Thus, it appears that by attempting to refute the FBI’s theory and methods with evidence about deficiencies in both the results and the testing of the results, the defendants have conceded that the theory and methods can be tested.

Id. (quoting United States v. Bonds, 12 F.3d 540, 559 (6th Cir.1993)). The State’s primary witness on the reliability of polygraphs testified there are numerous studies on polygraphs and their accuracy. By claiming that a number of those studies establish that polygraph examinations do not work, the State has implicitly conceded that the hypothesis underlying the control question polygraph can be tested. The State’s concession is supported by the NAS Report, which states “it is possible to do better field research than we have found in the literature and, over time, to use admittedly imperfect research designs, both experimental and observational, to advance knowledge and build methodological understanding, leading to better research design in the future.” NAS Report, supra, at 116.

{24} Based on the foregoing, we conclude that the control question polygraph examination can be tested. We believe the district court’s apparent finding to the contrary is erroneous. As was stated in United States v. Galbreth, 908 F.Supp. 877, 891 (D.N.M.1995), “[u]nlike an endeavor such as astrology, the scientific validity of which can never be empirically verified, it is possible to test [the control question] polygraph technique[ ].” We now turn to the published academic literature on the polygraph examination.

ii. Peer review and publication.

{25} The second factor we consider is whether the control question polygraph has been subjected to peer review and publication. Anderson, 118 N.M. at 291, 881 P.2d at 36. Peer review and publication is important because “submission to the scrutiny of the scientific community is a component of ‘good science,’ in part because it increases the likelihood that substantive flaws in methodology will be detected.” Daubert, 509 U.S. at 593, 113 S.Ct. 2786. Regarding this factor, the district court concluded that the control question polygraph “has been subjected to limited peer review publication,” but that “the relevant publications do not enhance confidence in the test results, particularly considering the effectiveness of counter-measures.”

{26} The committee that prepared the NAS Report gathered and evaluated as many polygraph validation studies as possible. The committee located 217 research reports of 194 separate studies. NAS Report, supra, at 107. Of those studies, 102 were deemed of sufficient quality to be included in the committee’s review of the polygraph. Id. Each of these studies met the following minimum criteria developed by the committee:

(1) documentation of examination procedures sufficient to allow a basic replication;
(2) independently determined truth;
(3) inclusion of both guilty and innocent individuals as determined by truth criteria;
(4) sufficient information for quantitative estimation of accuracy;
(5) polygraph scoring conducted blind to information about truth; and,
(6) in experimental studies, appropriate assignment to experimental groups germane to estimating accuracy (mainly, guilt and innocence).

Id. While the NAS Report concluded that the polygraph studies that met the criteria for consideration “do not generally reach the high levels of research quality desired in science,” it nonetheless observed that “a sizable number of polygraph studies have ... appeared in good-quality, peer-reviewed journals.” Id. at 108. The NAS Report speculated that so many polygraph studies have appeared in high-quality journals because of “the practical importance of the topic and the willingness of journals to publish laboratory studies that are high in internal validity but relatively low in salience to real-world application.” Id.

{27} Furthermore, both Petitioners and the State submitted as exhibits a number of articles on the validity of the control question polygraph, some of which were published in peer-reviewed journals. While the State argues these articles are insufficient and cannot be relied upon to establish the validity of the control question polygraph, that is not our focus at this point in the Alberico/Daubert inquiry. We are only looking at whether the scientific technique has been subjected to peer review and publication, not the validity of the scientific research or the scientific community’s response to the research. While there has certainly been a heated debate in the scientific community on the validity and accuracy of the control question polygraph, that debate “is a question of weight and not of admissibility.” Anderson, 118 N.M. at 298, 881 P.2d at 43. The fact that an ongoing debate exists is all that is required for this factor to be deemed satisfied. Notwithstanding the NAS Report’s criticisms of the polygraph validation studies conducted, we conclude that the NAS Report sufficiently establishes that the polygraph has been subjected to peer review and publication. We now turn to the validity of the scientific research on the control question polygraph.

iii. Rate of error.

{28} The third factor of the Daubert/Alberico analysis requires us to examine the known or potential rate of error of the control question polygraph. Anderson, 118 N.M. at 291, 881 P.2d at 36. With regard to the rate of error of the control question polygraph, the district court concluded that “[t]he potential rate of error is vague and unreliable” and because the base rate is unknown “the reliability of test results as reflected in an actual percentage misrepresents the confidence level in the test.”

{29} As noted in the preceding section of this opinion, a number of polygraph validation studies have been conducted and subsequently published. A review of those studies revealed that the median accuracy index of the polygraph in laboratory studies is 0.86 with an interquartile range of 0.81 to 0.91. NAS Report, supra, at 122. The control question test specifically had a median accuracy index of 0.85, with an interquartile range from 0.83 to 0.90. Id. at 125. The field studies reviewed had a median accuracy index of 0.89, with a range from 0.711 to 0.999. Id. The interquartile range of accuracy indexes for all the studies, laboratory and field, was 0.81 to 0.91. Id. at 126. Based on the foregoing, the NAS Report concluded “the empirical data clearly indicate that for several populations of naive examinees not trained in countermeasures, polygraph tests for event-specific investigation detect deception at rates well above those expected from random guessing.” Id. at 149. The State argues the high accuracy rates derived from the studies are invalid for a number of reasons.

{30} Specifically, the NAS Report was concerned that the high accuracy rates for polygraph examinations in the studies may not correspond with what can be expected when the polygraph is used in real-life situations. The hypothesis underlying the control question polygraph technique is that physiological responses increase the more concerned the subjects are about being deceptive, which, if true, means “polygraph accuracy in laboratory models [might] be on average somewhat below true accuracy in field practice, where the stakes are higher.” Id. at 127. However, the NAS Report noted that “[t]here is a plausible contrary hypothesis ... in which examinees who fear being falsely accused have strong emotional responses that mimic those of the truly deceptive,” in which case “field conditions might have more false-positive errors than are observed in the laboratory and less accuracy.” Id. Furthermore, the NAS Report noted that “[s]ubstantial experience with clinical diagnostic and screening tests suggests that laboratory models, as well as observational field studies of the type found in the polygraph literature, are likely to overstate true polygraph accuracy.” Id. at 128.

{31} The NAS Report also identified several specific issues that may affect the accuracy of any polygraph examinations that have not been fully researched. First, while individual differences in physiological makeup, personality traits, and sociocultural group identity may affect the accuracy of the polygraph, the research on these individual differences is scant. See id. at 134-37. Second, while examiner expectancies of guilt may influence either the examiners’ judgments of the polygraph charts or the examinees’ physiological responses during the examination, “[the] evidence is too limited to draw any strong conclusions about whether examiners’ expectancies affect polygraph test accuracy.” Id. at 138. Third, “given the few studies performed, the few drugs tested, and the analogue nature of the evidence, a conclusion that drugs do not affect polygraph validity would be premature.” Id. at 139. Fourth, while some empirical research indicates mental and physical countermeasures can decrease the likelihood of a polygraph examination detecting deceptive examinees, id. at 143, the NAS Report noted the limitations of that research, id. at 143-44. The NAS Report specifically stated “we do not know of scientific studies examining the effectiveness of countermeasures in contexts where systematic efforts are made to detect and deter them.” Id. at 151.

{32} In Anderson, we considered the known or potential rate of error in the DNA profiling process at issue in that case. 118 N.M. at 298-99, 881 P.2d at 43-44. Similar to the State in this case, the defendant in Anderson argued that the accuracy rates of the DNA profiling process in that case were invalid for a number of reasons. While we noted that the deficiencies in calculating the rate of error were troubling, we stated the deficiencies in that case “[spoke] to the weight of the evidence and not to its admissibility.” Id. at 299, 881 P.2d at 44. In this case, we reach the same conclusion. Polygraph results are far from conclusive; however, as the NAS Report concluded, numerous studies have shown that polygraph tests can detect deception at rates well above chance. In fact, testimony at the evidentiary hearing indicates that the degree of accuracy of polygraph examinations is similar to many diagnostic techniques employed in the medical field, including magnetic resonance imaging (MRI), CAT scanning, ultrasound, and x-ray film. The opponent of polygraph evidence has ample opportunity through cross-examination and argumentation to cast doubt upon the results of any particular polygraph examination that have been admitted into evidence.

{33} The State nevertheless argues that the rate of error for polygraph evidence is unknown because the base rate is unknown. The district court found that the base rate, or ground truth, is “the proportion of people in a population as they relate to a particular trait in issue.” In the context of the polygraph, the base rate is generally the percentage of persons in a sample who are telling the truth. For example, if a polygraph study involved 100 subjects, and 85 of the subjects were actually telling the truth, the base rate would be 85%. The base rate does not measure the accuracy of the polygraph, which is the ability of the polygraph itself to correctly identify deceptive subjects and truthful subjects. The base rate is a measure only of the percentage of truthful subjects in the sample population. The true base rate is unknowable, but is theoretically important because it defines the degree of confidence properly afforded a particular polygraph result. Following are two examples used by the State to illustrate the point. In both examples the polygraph is assumed to be 90% accurate in detecting deception. Therefore, with a population of 100 subjects, the polygraph would correctly identify 90 of the subjects as either truthful or deceptive, while incorrectly identifying the remaining 10 subjects.

{34} In the first example, we assume a base rate of 50%, that is 50 of the 100 subjects are being truthful in their polygraph examination. Thus, with an accuracy rate of 90%, the polygraph will correctly identify 45 persons as deceptive and 45 persons as truthful, and it will incorrectly identify 5 persons as deceptive and 5 persons as truthful:

            Not Deceptive    Deceptive
    Pass         45               5
    Fail          5              45

In the second example, we assume that only 10% of the 100 subjects are being truthful, while the remaining 90% are being deceptive. As a result, 81 of the 90 deceptive subjects will be accurately identified as deceptive and the remaining 9 will be incorrectly identified as truthful. Therefore, in this sample of 100 subjects, 9 truthful subjects will pass, but 9 deceptive subjects will also pass. Of the 18 subjects deemed to have passed the polygraph, there is only a 50% likelihood that any individual subject was actually truthful:

            Not Deceptive    Deceptive
    Pass          9               9
    Fail          1              81

These examples illustrate the importance of the base rate: in a pool with a higher percentage of deceptive subjects, the likelihood that a passed polygraph indicates actual truthfulness decreases. Specifically, in the first example a passed polygraph examination is 90% likely to be correct; whereas, in the second example, a passed polygraph is only 50% likely to be correct.
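The arithmetic behind the two examples can be reproduced with a short sketch. It is illustrative only and is not part of the record: the function name and its parameters are hypothetical, and it assumes, as the State's examples do, that the polygraph is equally accurate for truthful and deceptive subjects.

```python
from fractions import Fraction


def polygraph_outcomes(n_subjects, truthful_pct, accuracy_pct):
    """Tabulate pass/fail counts for a hypothetical pool of examinees.

    Assumption (taken from the examples in the text): the same accuracy
    rate applies to truthful and to deceptive subjects.
    """
    accuracy = Fraction(accuracy_pct, 100)
    truthful = n_subjects * Fraction(truthful_pct, 100)  # the base rate
    deceptive = n_subjects - truthful

    table = {
        "truthful_pass": truthful * accuracy,          # correctly passed
        "truthful_fail": truthful * (1 - accuracy),    # false positives
        "deceptive_pass": deceptive * (1 - accuracy),  # false negatives
        "deceptive_fail": deceptive * accuracy,        # correctly failed
    }
    passed = table["truthful_pass"] + table["deceptive_pass"]
    # Share of passing subjects who were actually truthful.
    table["pass_is_truthful"] = table["truthful_pass"] / passed
    return table


first = polygraph_outcomes(100, truthful_pct=50, accuracy_pct=90)
second = polygraph_outcomes(100, truthful_pct=10, accuracy_pct=90)
print(float(first["pass_is_truthful"]), float(second["pass_is_truthful"]))
```

Only the base rate differs between the two calls; the accuracy figure is held at 90% in both, which is the point the examples are meant to make.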

{35} We cannot determine the base rate in the context of the polygraph because we cannot determine in advance how many persons are telling the truth and how many are not. However, the base rate has no effect on the reliability of the polygraph — regardless of whether 50% or 90% of the sample population is deceptive, the accuracy of the polygraph remains unchanged. The base rate only affects the confidence that we have in making decisions based on the results of any one polygraph examination. The accuracy of the polygraph in both of the above examples was the same, but in the second example we would have less confidence than in the first example that a passed polygraph examination was correct. Nonetheless, even in the second example, evidence that a subject passed a polygraph examination has a tendency to make the existence of a fact more or less probable than it would be in the absence of the evidence. Prior to the subject passing the polygraph examination, we would have assumed only a 10% chance that subject was truthful. After passing the examination, though, the likelihood the subject was truthful has increased to 50%. Therefore, the fact that the base rate is unknowable does not preclude admissibility under Rule 11-702. It simply provides another basis for the opposing party to cast doubt upon the results of a particular polygraph examination through cross-examination and argumentation. We now turn to whether standards exist controlling the polygraph.
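The shift from a 10% to a 50% likelihood of truthfulness described above can be written out as an application of Bayes' rule. The notation is ours and is offered only as a sketch: T denotes a truthful subject, D a deceptive subject, and the polygraph is assumed, as in the State's examples, to be 90% accurate for both groups.

```latex
\[
P(T \mid \text{pass})
  = \frac{P(\text{pass} \mid T)\,P(T)}
         {P(\text{pass} \mid T)\,P(T) + P(\text{pass} \mid D)\,P(D)}
\]
% First example: base rate P(T) = 0.5
\[
P(T \mid \text{pass}) = \frac{0.9 \times 0.5}{0.9 \times 0.5 + 0.1 \times 0.5}
  = \frac{0.45}{0.50} = 0.90
\]
% Second example: base rate P(T) = 0.1
\[
P(T \mid \text{pass}) = \frac{0.9 \times 0.1}{0.9 \times 0.1 + 0.1 \times 0.9}
  = \frac{0.09}{0.18} = 0.50
\]
```

In both cases the accuracy terms are identical; only P(T), the base rate, changes the result.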

iv. Maintenance of standards controlling the technique.

{36} Additionally, we examine “the existence and maintenance of standards controlling the technique’s operation.” Daubert, 509 U.S. at 594, 113 S.Ct. 2786. The district court found that “[t]here are no set standards [for the administration of the control question polygraph] other than those set out in Rule 11-707,” which the court concluded were insufficient.

{37} In this state, it is unlawful to “practice polygraphy for any remuneration without a license issued by the [regulation and licensing] department in accordance with the Private Investigators and Polygraphers Act.” NMSA 1978, § 61-27A-3(E) (1993). To qualify for a license to practice polygraphy, a person must meet the requirements of NMSA 1978, § 61-27A-6(G) (1993), which states:

G. The department shall issue a license for polygrapher to a person who files a completed application accompanied by the required fees and who submits satisfactory evidence that the applicant:
(1) is at least eighteen years of age;
(2) possesses a high school diploma or its equivalent;
(3) has not been convicted of a felony or misdemeanor involving moral turpitude; and
(4) has graduated from a polygraph examiners course approved by the department and:
(a) has completed a probationary operational competency period and passed an examination of ability to practice polygraphy; or
(b) has submitted proof of holding, for a minimum of two years immediately prior to the date of application, a current license to practice polygraphy in another jurisdiction whose standards equal or surpass those of New Mexico.

{38} Furthermore, Rule 11-707(B) imposes additional restrictions on who can testify as an expert witness regarding polygraph results. A polygraph expert must have “at least five (5) years’ experience in administration or interpretation of polygraph examinations or equivalent academic training.” Rule 11-707(B)(1). Also, the polygraph expert must have “successfully completed at least twenty (20) hours of continuing education in the field of polygraph examinations during the twelve (12) month period immediately prior to the date of the examination.” Rule 11-707(B)(3). Between the restrictions governing who can perform polygraph examinations in this state and those governing who can testify regarding polygraph results, sufficient standards are in place controlling the polygraph examiner.

{39} Also, Rules 11-707(C) and (E) contain a number of prerequisites to the admission of polygraph results:

C. Admissibility of results. Subject to the provisions of these rules, the opinion of a polygraph examiner may in the discretion of the trial judge be admitted as evidence as to the truthfulness of any person called as a witness if the examination was performed by a person who is qualified as an expert polygraph examiner pursuant to the provisions of this rule and if:
(1) the polygraph examination was conducted in accordance with the provisions of this rule;
(2) the polygraph examination was quantitatively scored in a manner that is generally accepted as reliable by polygraph experts;
(3) prior to conducting the polygraph examination the polygraph examiner was informed as to the examinee’s background, health, education and other relevant information;
(4) at least two (2) relevant questions were asked during the examination; and
(5) at least three (3) charts were taken of the examinee. ...
E. Recording of tests. The pretest interview and actual testing shall be recorded in full on an audio or video recording device.

It has been noted by one commentator that “[i]n the treatment of the technical aspects of polygraph examination protocol, [Rule 11-707] goes far beyond the case law or statutes of any other jurisdiction in providing usable standards.” James R. McCall, Misconceptions and Reevaluation — Polygraph Admissibility After Rock and Daubert, 1996 U. Ill. L.Rev. 363, 388 (1996).

{40} The American Polygraph Association (APA), the leading polygraph professional association, has developed protocol standards for the polygraph similar to those contained in Rule 11-707. See American Polygraph Association, Division III: APA Standards of Practice (Jan. 10, 1999), available at http://www.polygraph.org/standards.htm. Under these standards, prior to examination, the polygraph examiner must make a reasonable effort to determine whether an examinee is fit for polygraph testing by inquiring into the medical and psychological condition of the examinee, as well as any recent drug use by the examinee, APA Standard 3.4.1; the polygraph instruments must be APA approved and have been calibrated, APA Standard 3.5; and a pretest interview must be conducted where the examiner both discusses with the examinee the polygraph process and the issues to be tested and ensures that the examinee recognizes and understands each question, APA Standard 3.8. During the examination, the questions used must be clear and distinct, APA Standard 3.9.3; the questions used must be balanced in terms of length and impact, APA Standard 3.9.4; the examiner must collect a sufficient number of charts, APA Standard 3.9.5; standardized chart markings should be used, APA Standard 3.9.7; and either an audio or audio/video recording of the pretest and in-test phase of the examination must be made, APA Standard 3.9.8. As for scoring the chart, the examiner must use numerical scoring, APA Standard 3.10.1; and the examiner’s notes must have “sufficient clarity and precision so that another examiner could read them,” APA Standard 3.10.2.

{41} Based on the foregoing, we conclude sufficient standards are in place governing the control question polygraph technique, so as to allow expert testimony on the subject to be admissible. In order for polygraph expert evidence to be admissible under Rule 11-707, the polygraph examination must be conducted in a particular manner by a qualified examiner. Furthermore, as previously explained, the APA has established even more detailed standards of practice in order to ensure the utmost degree of accuracy in detecting truthfulness or deception with the polygraph.

v. Acceptance by relevant scientific community.

{42} Finally, while “general acceptance is not a requirement for admissibility under [Rule 11-702], it is a factor the court may consider.” Anderson, 118 N.M. at 299, 881 P.2d at 44. As the United States Supreme Court noted in Daubert, “a known technique which has been able to attract only minimal support within the community may properly be viewed with skepticism.” 509 U.S. at 594, 113 S.Ct. 2786 (quotation marks and quoted authority omitted). In this case, the district court concluded that “[c]ontrol question polygraph tests are not accepted in the relevant scientific community at a significant level, particularly considering the age of the technique.”

{43} In arguing whether the control question polygraph has been generally accepted by the relevant scientific community, the parties have identified four surveys of psychologists’ opinions regarding polygraph examinations, including: The Gallup Organization, Survey of Members of the Society for Psychophysiological Research Concerning Their Opinion of Polygraph Test Interpretation, 13 Polygraph 153 (1984) [hereinafter Gallup Survey]; Susan L. Amato, A Survey of Members of The Society for Psychophysiological Research Regarding the Polygraph: Opinions and Implications (1993) (unpublished Master’s thesis, University of North Dakota) (on file with the University of North Dakota Library) [hereinafter Amato Survey]; W.G. Iacono & D.T. Lykken, The Validity of the Lie Detector: Two Surveys of Scientific Opinion, 82 J. of Applied Psychol. 426 (1997) [hereinafter Iacono Survey]; and Honts et al., General Acceptance of the Polygraph by the Scientific Community (Mar. 9, 2002) (unpublished paper presented at the meetings of the American Psychology Law Society, on file with author) [hereinafter Honts Survey]. Of these four surveys, the district court found the Iacono survey to be the most reliable, and relied exclusively on that survey in drawing its conclusion that control question polygraph examinations do not enjoy general acceptance within the scientific community.

{44} In the Gallup Survey, conducted in 1982, a random sample of 155 members of the Society for Psychophysiological Research were interviewed regarding their opinion of the use of polygraph testing procedures to detect deception. Gallup Survey, supra, at 154. When asked their opinion of polygraph tests for interpreting whether a subject is or is not telling the truth, 61% of the respondents agreed that the polygraph is a useful diagnostic tool when considered with other available information. Id. at 157. An additional 32% agreed that the polygraph is of questionable usage and is entitled to little weight against other information. Id. Only 3% believed that the polygraph is of no usefulness. Id. In 1993, Amato replicated the Gallup Survey in an effort to determine if there were any changes in the scientific community’s opinions on the validity of the polygraph in the preceding ten years. Amato Survey, supra, at 1. The Amato Survey received 136 total responses, for a response rate of approximately 30%. Id. at 2. This time, when asked the same question as in the Gallup Survey, 60% of the respondents agreed that the polygraph is a useful diagnostic tool, 37% agreed it is of questionable usage, and 2% believed it was of no usefulness. Id. at 3.

{45} In 1997, two groups of scientists were surveyed in an attempt to “more thoroughly assess current scientific opinion about polygraphy.” Iacono Survey, supra, at 427. The first group surveyed by Iacono was the same one used in both the Gallup Survey and the Amato Survey: the Society for Psychophysiological Research. Id. at 428. Questionnaires were sent to 216 society members, and 195 members responded. Id. at 429. Of those who responded and had an opinion on the polygraph, only 36% believed that the control question technique is “based on scientifically sound psychological principles or theory”; whereas, 77% believed the guilty knowledge test is based on sound psychological principles. Id. at 430. The second group surveyed was the Fellows of Division 1 (General Psychology) of the American Psychological Association. Id. at 428. Questionnaires were mailed to 249 APA Fellows, and 168 usable questionnaires were returned. Id. at 429. In this group, only 30% believed the control question technique is based on sound psychological principles and 72% believed the same of the guilty knowledge test. Id. at 430.

{46} Finally, in 2002, a paper was presented at the meetings of the American Psychology Law Society (APLS) that was based on two surveys: one of the APLS and one of the SPR. Honts Survey, supra, at 1, 8. Only 55 out of 205 APLS members responded, and 38 out of 366 SPR members responded. Id. at 8. Of those who responded, 96% of the APLS members and 91% of the SPR members believed that polygraph studies published in scientific peer-reviewed journals are “based on generally accepted scientific methodology.” Id. at 14. When asked to compare the usefulness of the polygraph to other specific examples of commonly admitted evidence, more than half of the respondents believed that polygraph evidence is as useful or more useful than a psychologist’s opinion of parental fitness, a psychologist’s opinion regarding malingering, an eyewitness identification of a robbery suspect, a psychological assessment of dangerousness, and a psychological assessment of temporary insanity. Id. at 15. Finally, slightly more than half of the APLS respondents and slightly less than half of the SPR respondents believed that the accuracy of judicial verdicts would be increased if polygraph test results were admitted as evidence at trial. Id. at 16.

{47} As noted earlier in this opinion, see supra ¶ 27, there is a heated debate in the scientific community on the validity of the control question polygraph examination. This debate is reflected by the competing surveys cited above. The Iacono Survey was conducted by Dr. William Iacono, Professor of Psychology at the University of Minnesota, who testified on behalf of the State at the evidentiary hearing below. The Amato Study was a Master’s thesis conducted under the guidance of Dr. Charles Honts, Professor of Psychology at Boise State. Dr. Honts also was the lead scientist of the Honts Study. He testified on behalf of the Petitioners at the hearing below. The hearing below was not the first time that Dr. Iacono and Dr. Honts have been on opposing sides in the debate over the admissibility of polygraph examination results. Compare David C. Raskin, Charles R. Honts & John C. Kircher, The Scientific Status of Research on Polygraph Techniques: The Case for Polygraph Tests, in 1 Modern Scientific Evidence: The Law and Science of Expert Testimony § 14-2.0 (David L. Faigman et al. eds., 1997); with William G. Iacono & David T. Lykken, The Scientific Status of Research on Polygraph Techniques: The Case Against Polygraph Tests, in 1 Modern Scientific Evidence, supra, § 14-3.0. Based on the foregoing, we cannot conclude that the control question polygraph has been generally accepted within the scientific community. However, we also cannot conclude that the control question polygraph has been uniformly rejected by the scientific community. This factor thus carries little weight in our Alberico/Daubert analysis of the control question polygraph.

IV. CONCLUSION.

{48} Based on the foregoing, we conclude that the control question polygraph examination is sufficiently reliable to satisfy Rule 11-702. In so holding, we are cognizant of a number of potential problems with polygraph results, such as the use of physical and mental countermeasures to “beat the polygraph” and the influence on results of examiner expectancies. The district court was correct to be concerned by these problems; however, as we noted earlier in the opinion, any doubt about the admissibility of scientific evidence should be resolved in favor of admission. See supra ¶ 16. The remedy for the opponent of polygraph evidence is not exclusion; the remedy is cross-examination, presentation of rebuttal evidence, and argumentation. See Daubert, 509 U.S. at 596, 113 S.Ct. 2786 (“Vigorous cross-examination, presentation of contrary evidence, and careful instruction on the burden of proof are the traditional and appropriate means of attacking shaky but admissible evidence.”).

{49} Our reaffirmation of Rule 11-707 is also based, at least in part, on principles of fairness. Often the same government officials who vigorously oppose the admission of exculpatory polygraphs of the accused find polygraph testing to be reliable enough to use in their own decision-making. Federal and state governments rely upon the results of polygraph examinations for a variety of law enforcement purposes, even in jurisdictions where polygraph evidence is inadmissible. For example, the polygraph is used to determine whether there is probable cause to arrest and whether to prosecute. See Johnson v. Schneiderheinz, 102 F.3d 340, 342 (8th Cir.1996) (holding that a police officer reasonably relied upon polygraph results, among other factors, in making his decision to arrest); Brodnicki v. City of Omaha, 75 F.3d 1261, 1267 (8th Cir.1996) (stating that the county attorney was under a “duty” to review the polygraph evidence in that case “as part of his role as advocate for the state”); Bennett v. City of Grand Prairie, 883 F.2d 400, 405-06 (5th Cir.1989) (holding that a magistrate judge may consider polygraph results when determining whether probable cause exists to issue an arrest warrant). Polygraphs have also been employed to make various disciplinary and sentencing decisions. See Lenea v. Lane, 882 F.2d 1171, 1174 (7th Cir.1989) (holding that polygraph results are admissible in prison disciplinary proceedings); United States v. Chaney, 1996 WL 187515, *1 (10th Cir.) (holding that the district court may use a defendant’s polygraph examination to determine the amount of restitution in an embezzlement case). Most jurisdictions also approve of requiring polygraph examinations as a condition of probation. See Anne M. Payne, Annotation, Propriety of Conditioning Probation on Defendant’s Submission to Polygraph or Other Lie Detector Testing, 86 A.L.R.4th 709 (1991).

{50} In short, we believe a categorical exclusion of polygraph results would be unwise. See United States v. Scheffer, 523 U.S. 303, 318, 118 S.Ct. 1261, 140 L.Ed.2d 413 (1998) (Kennedy, J., concurring in part and concurring in judgment) (doubting the wisdom of a per se exclusion of polygraph evidence). Therefore, we refuse to repeal Rule 11-707; instead, we order the district courts in the pending cases to comply with Rule 11-707 in determining whether to admit polygraph examination results. The proponents of such polygraph evidence are not required to independently establish the reliability of the examiner’s testimony under Rule 11-702.

{51} IT IS SO ORDERED.

PETRA JIMENEZ MAES, Chief Justice, PATRICIO M. SERNA, RICHARD C. BOSSON and EDWARD L. CHAVEZ, Justices, concur.

APPENDIX

SECOND JUDICIAL DISTRICT COURT

COUNTY OF BERNALILLO

STATE OF NEW MEXICO

KEVIN LEE, et al,

Petitioners,

-vs-

HON. LOURDES MARTINEZ,

Respondents.

No. CS 2003-00026

(Supreme Court No. 27,915)

Findings of Fact and Conclusions of Law

Introduction:

Pursuant to Supreme Court Order issued in this matter, this Court is directed to enter findings of fact and conclusions of law. Given the tremendous volume of information presented by the parties as well as the testimony of several of the leading authorities on the issues decided, the Court has taken upon itself to provide an introductory section that includes an overview of the status of the law on polygraph examinations nationwide in both state and federal courts and a description of the polygraph examination process with the hope that it will assist the reviewing court. The findings of fact and conclusions of law follow these sections.

While many of the materials presented by both sides are worthy of note, a recent publication, The Polygraph and Lie Detection (PALD), a 2003 publication of the National Academy of Sciences (NAS), is particularly helpful. PALD focuses on the use of the polygraph in relation to employee screening. But since most of the research is in the area of event-specific investigations, its analysis of that research is highly useful in this context as well.

Another highly useful source is Faigman, The Law and Science of Expert Testimony (2002), or “Faigman”. In Volume 2, § 19-2.0 is an article titled, The Scientific Status of Research on Polygraph Techniques: The Case for Polygraph Tests, by Honts, Raskin, and Kircher. Later, § 19-3.0, is an article titled, The Scientific Status of Research on Polygraph Techniques: The Case Against Polygraph Tests, by Iacono and Lykken.

The Court recommends the two sources listed above for excellent overviews of some of the issues. In addition to the above, the parties to this action provided many exhibits, articles on nearly every aspect of polygraph examinations, studies relating to polygraph examinations, transcripts of testimony, and caselaw.

Without trying to oversimplify the issues presented, in evaluating the standards adopted in State v. Alberico, 116 N.M. 156, 861 P.2d 192 (1993), and restated in State v. Anderson, 118 N.M. 284, 881 P.2d 29 (1994), the testimony and arguments tended to gravitate to a number of key issues:

First, whether there is a theory and whether it can be and has been tested. This includes the effect of base rates in determining reliability of test results in assisting the trier of fact and determining the balance between the probative value and prejudicial effect of the testimony;
Second, whether the theory or technique has been subjected to peer review and publication;
Third, whether there is a known potential rate of error in using polygraph techniques as well as whether there are standards that exist and are maintained that control the technique’s operations;
Fourth, acceptance of the test in the relevant scientific community; and,
Fifth, whether the technique is based upon well-recognized scientific principles and whether it is capable of supporting opinions based upon reasonable probability rather than conjecture.

To the extent possible, the findings of fact will be set out in sections that will address each of these factors.

POLYGRAPH EXAMINATION PROCEDURES

A polygraph examination combines interrogation with physiological measurements made by the instrument, or polygraph. The instrument typically measures and records an examinee’s heart rate, blood pressure, rate and depth of respiration and flow of electrical current at the skin surface as an examiner poses questions that require yes or no answers. Blood pressure is measured by a cuff over the biceps. Electrodermal activity (activity of the eccrine sweat glands) is measured by electrodes on the palm or on two fingers. Rate and depth of breathing are measured by pneumographs located on the chest and abdomen. Fluctuations in the heart and blood are recorded by a cardiosphygmograph, while a galvanometer records the body’s electrical activity.

The sensors attached to the examinee are connected to the instrument by wires. The data is recorded by analog or digital technology. Because the first analog instruments recorded the data with several pens writing lines on a piece of moving paper, the record of the examinee’s physiological responses is known as the polygraph chart.

The instrument does not measure or detect lies directly. Instead, proponents believe it measures physiological responses that are stronger when an examinee lies than at other times. A lie in response to a question may cause a reaction such as fear of detection or psychological arousal that changes heart rate, blood pressure, breathing rate, or skin conductance relative to what they were before the question was asked and relative to what they are after control questions are asked.

Polygraph testing is used for three main purposes: 1. Screening of job applicants by law enforcement or other government agencies (preemployment screening); 2. Screening by agencies involved in national security of current employees; and 3. Investigating specific incidents, as in criminal cases. When police conduct a polygraph test of a suspect, it is considered to be under adversarial conditions. In contrast, when defense counsel asks a client to take a privately administered test, it is called a “friendly” test. If the client passes the friendly test, defense counsel will often attempt to enter the results into evidence, and this is the more typical background for an evidentiary hearing like the present one.

There are three major questioning techniques used in polygraph examinations: the relevant-irrelevant test (RIT), the guilty knowledge test (GKT), and the control question or comparison question test (CQT). The CQT’s “are the most widely used techniques in criminal investigations and judicial proceedings.” Because the CQT is the most used test in criminal cases and because the tests in the instant cases were apparently CQT’s, this Court’s analysis will focus on that technique. Under Rule 11-707 NMRA 2003, tests using any of the three techniques would be admissible if that Rule’s criteria were met.

The CQT tries to determine if the examinee is lying in response to a specific question or questions about the incident at issue (relevant questions). This involves comparing physiological responses to the relevant questions with physiological responses to control questions. Because the cuff on the arm begins to hurt after several minutes, a limited number of questions, about ten, are asked to complete one chart. Rule 11-707 requires that an examination include at least three charts.

Prior to the actual CQT, there is a pre-test interview. The examiner and examinee discuss the test, test procedure, examinee’s medical history, and details of the test issues. Both relevant and control questions are reviewed, to minimize surprise and to ensure the examinee understands the questions. This portion of the examination may last from 30 minutes to 2 hours or more. The expectation is that innocent examinees will react more strongly to control questions than to relevant questions, and guilty examinees will react more strongly to relevant questions. For example, a relevant question might be, “Did you rob the First City Bank?” Control questions are vague, cover a long period of time, and describe acts that most people have committed but are reluctant or embarrassed to admit during a polygraph exam. That is, if the examinee were suspected of theft, a control question could be, “During the first 22 years of your life did you ever take something that did not belong to you?”

Innocent people answer the relevant questions truthfully, but are expected to lie or be uncertain about their truthfulness when answering the control questions. That is, in these “probable-lie” control question tests, the instructions are designed to induce innocent people to answer “no” to control questions, even though most would then be lying. In contrast, guilty people are expected to be more concerned about failing the test because their answers to the relevant questions are lies, and they are likely to be more disturbed by the relevant questions, or so the reasoning behind CQT goes. Thus, the “art of the polygrapher lies in composing control and relevant questions that elicit the appropriate relative responses from truthful and deceitful parties.” See State v. Porter, [241 Conn. 57] 698 A.2d 739, 762 (Conn.1997)(assuming without deciding that polygraph evidence met Daubert criteria but upholding per se rule barring its admissibility because prejudice outweighed probative value).

In another version of the CQT, the “directed-lie” test, examinees are instructed to lie to control questions such as, “Before 2002, did you ever make even one mistake?” The examiner tells the examinee that these questions will ensure that the examinee will be correctly classified as truthful or deceptive on the polygraph test to follow. Where the polygrapher in the probable-lie test chooses control questions during the pre-test interview to suit each examinee, the directed-lie control questions are a small set of simple questions that are “much easier to standardize.”

After the test, the charts are scored by a polygrapher or by a computer. Each relevant question response is measured against an adjacent control question response. Scores for each comparison range from +3 to -3. When the response to the control question is much stronger than to the relevant question, it is scored +3, indicating truthfulness. A score of -3 indicates a much stronger response to the relevant question relative to the response to the control question, indicating deception. If the two responses are about the same, the score is 0, with scores of ±1 and ±2 for intermediate values. The scores for all three charts are totaled. Examinees with scores of +6 or greater are considered truthful; those with scores of -6 or lower are deemed to be lying. Scores between +5 and -5 are inconclusive. The total score may range from approximately +30 to -30. But see United States v. Galbreth, 908 F.Supp. 877, 894 (D.N.M.1995), where the leading proponent of polygraph evidence, Dr. David Raskin, scored the defendant’s charts as +32. Charts may also be scored by computers using standardized algorithms, a relatively recent development.
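The scoring rule just described can be sketched in a few lines of Python. This is an illustration only, not any examiner's actual procedure: the function names `total_score` and `classify` and the sample chart values are hypothetical, and the cutoffs are the ones stated above (+6 or greater truthful, -6 or lower deceptive, anything between inconclusive).

```python
from typing import List


def total_score(charts: List[List[int]]) -> int:
    """Add up the per-comparison scores (each from -3 to +3) across all charts."""
    for chart in charts:
        for score in chart:
            if not -3 <= score <= 3:
                raise ValueError(f"comparison score out of range: {score}")
    return sum(sum(chart) for chart in charts)


def classify(total: int) -> str:
    """Apply the cutoffs described in the text to a total score."""
    if total >= 6:
        return "truthful"
    if total <= -6:
        return "deceptive"
    return "inconclusive"


# Hypothetical example: three charts, three comparisons per chart.
charts = [[1, 2, 0], [0, 1, 1], [2, 0, 1]]
total = total_score(charts)
print(total, classify(total))
```

The sketch makes one point visible: the outcome depends entirely on where the cutoffs are placed, which is a choice rather than a measurement.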

ADMISSIBILITY OF POLYGRAPH EVIDENCE IN OTHER STATE COURTS

Eighty years ago, polygraph evidence was held inadmissible because it was not “sufficiently established to have gained general acceptance in the particular field in which it belongs.” See Frye v. United States, 293 F. 1013 (D.C.Cir.1923). The standards for the admission of scientific evidence were changed by Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579, 113 S.Ct. 2786, 125 L.Ed.2d 469 (1993), and many states, including New Mexico, adopted those standards. See State v. Alberico, 116 N.M. 156, 861 P.2d 192 (1993). Consequently, supporters of polygraph evidence sought its admission under the new standards. They have had little success before courts that have maintained pre-Daubert standards or courts that have adopted Daubert.

Twenty-seven (27) states and the District of Columbia apply a per se rule of exclusion of polygraph evidence for all purposes. See Pulakis v. State, 476 P.2d 474 (Alaska 1970); People v. Anderson, 637 P.2d 354 (Colo.1981) (applying Frye, which Colorado abandoned in People v. Shreck, 22 P.3d 68 (Colo.2001)); State v. Porter, [241 Conn. 57] 698 A.2d 739 (Conn.1997); State v. Okumura, [78 Hawai'i 383] 894 P.2d 80 (Haw.1995); People v. Sanchez, [169 Ill.2d 472, 215 Ill.Dec. 59] 662 N.E.2d 1199 (Ill.1996); Morton v. Commonwealth, 817 S.W.[2d]218 (Ky.1991); State v. Harnish, 560 A.2d 5 (Me.1989); State v. Hawkins, [326 Md. 270] 604 A.2d 489 (Md. 1992); Commonwealth v. Mendes, [406 Mass. 201] 547 N.E.2d 35 (Mass.1989); State v. Anderson, 379 N.W.2d 70 (Minn.1985); Weatherspoon v. State, 732 So.2d 158 (Miss. 1999); State v. Hall, 955 S.W.2d [198] (Mo. 1997); State v. Staat, [248 Mont. 291] 811 P.2d 1261 (Mont.1991); State v. Steinmark, [195 Neb. 545] 239 N.W.2d 495 (Neb.1976); State v. Ober, [126 N.H. 471] 493 A.2d 493 (N.H.1985); People v. Angelo, [88 N.Y.2d 217, 644 N.Y.S.2d 460] 666 N.E.2d 1333 (N.Y. 1996); State v. Grier, [307 N.C. 628] 300 S.E.2d 351 (N.C.1983); Fulton v. State, 541 P.2d 871 (Okla.Crim.App.1975); State v. Brown, [297 Or. 404] 687 P.2d 751 (Or.1984); Commonwealth v. Brockington, [500 Pa. 216] 455 A.2d 627 (Pa.1983); In Re Odell, 672 A.2d 457 (R.I.1996); State v. Hart, 911 S.W.2d 371 (Tenn.Crim.App.1995); Tennard v. State, 802 S.W.2d 678 (Tex.Crim.App. 1990); State v. Hamlin, [146 Vt. 97] 499 A.2d 45 (Vt.1985); Robinson v. Commonwealth, [231 Va. 142] 341 S.E.2d 159 (Va.1986); State v. Beard, [194 W.Va. 740] 461 S.E.2d 486 (W.Va.1995); State v. Dean, [103 Wis.2d 228] 307 N.W.[2d]628 (Wis.1981), declined to follow on other grounds by State v. Davis, [254 Wis.2d 1] 645 N.W.2d 913 (Wis.2002); Contee v. United States, 667 A.2d 103 (D.C.1995).

These per se states ban polygraph evidence, including test results, offers to take the test, as well as refusals to take the test, for a variety of reasons. These courts found that the polygraph has not been proven valid or reliable or that it has not been generally accepted in the scientific community. But a more salient reason for the outright ban is that the prejudice in a jury trial outweighs the probative value of corroborating a witness’s credibility. See State v. Porter, [241 Conn. 57] 698 A.2d 739 (Conn.1997)(“State appellate courts, for whom Daubert is not mandatory authority, largely agree with our assessment that the prejudicial impact of polygraph evidence outweighs its probative value.”) Id. at 773.

Four of the above states (Massachusetts, North Carolina, Oklahoma, and Wisconsin) had admitted polygraph evidence for years, but have since returned to a per se ban. See Commonwealth v. Mendes, [406 Mass. 201] 547 N.E.2d 35, 41 (Mass.1989)(citing inter alia dangers of confusing jury and usurping jury’s role and the “overwhelming authority throughout country”) and State v. Dean, [103 Wis.2d 228] 307 N.W.2d 628, 653 (Wis.1981)(“Adequate standards have not developed in the seven years since [the decision to admit polygraph evidence on stipulation] to guide the trial courts in exercising their discretion in the admission of polygraph evidence. The lack of such standards heightens our concern that the burden on the trial court to assess the reliability of stipulated polygraph evidence may outweigh any probative value the evidence may have.”)

Seventeen (17) states admit polygraph evidence at trial only when its admission is stipulated to in advance by all parties. See Ex Parte Hinton, 548 So.2d 562 (Ala.1989); State v. Valdez, [91 Ariz. 274] 371 P.2d 894 (Ariz.1962); Holcomb v. State, [268 Ark. 138] 594 S.W.2d 22 (Ark.1980); People v. Fudge, [7 Cal.4th 1075, 31 Cal.Rptr.2d 321] 875 P.2d 36 (Cal.1994); Melvin v. State, 606 A.2d 69 (Del.1992); Delap v. State, 440 So.2d 1242 (Fla.1983); Fargason v. State, [266 Ga. 463] 467 S.E.2d [551] 553 (Ga.1996); State v. Fain, [116 Idaho 82] 774 P.2d 252 (Idaho 1989); Sanchez v. State, 675 N.E.2d 306 (Ind. 1996); State v. Losee, 354 N.W.2d 239 (Iowa 1984); State v. Webber, [260 Kan. 263] 918 P.2d 609 (Kan.1996); Corbett v. State, [94 Nev. 643] 584 P.2d 704 (Nev.1978); State v. McDavitt, [62 N.J. 36] 297 A.2d 849 (N.J. 1972); State v. Stevenson, 652 N.W.2d 735 (S.D.2002); State v. Crosby, 927 P.2d 638 (Utah 1996); State v. Renfro, [96 Wash.2d 902] 639 P.2d 737 (Wash.1982); Schmunk v. State, 714 P.2d 724 (Wyo.1986).

In these states, stipulation usually means both parties agree prior to a subject taking a test that the results will be admissible and that the adversely affected party retains the right to cross-examine the polygraph examiner and otherwise to attempt to impeach the polygraph evidence. See, e.g., State v. Valdez, [91 Ariz. 274] 371 P.2d 894 (Ariz.1962). Generally, these appellate decisions do not claim that the evidence is probative or becomes reliable due to the stipulation. See Delap v. State, 440 So.2d 1242, 1247 (Fla. 1983). Some courts, however, have concluded that the stipulation makes the test reliable: it raises the examinee’s fear and leads to the selection of more impartial examiners, tending to produce more accurate results.

Two (2) other states admit stipulated results but in limited circumstances. See State v. Yodsnukis, 281 N.W.2d 255 (N.D. 1979)(post-trial proceedings) and State v. Souel, [53 Ohio St.2d 123] 372 N.E.2d [1318] 1313 (Ohio 1978)(for corroboration or impeachment only).

Louisiana and Michigan allow the admission of polygraph evidence without stipulation but only in post-trial proceedings. See State v. Catanese, 368 So.2d 975 (La.1979) and People v. Barbara, [400 Mich. 352] 255 N.W.2d 171 (Mich.1977).

South Carolina generally bars admission of polygraph evidence, but the decision is now left to the discretion of the trial judge after a hearing applying Rules of Evidence 702 and 403. See State v. Council, [335 S.C. 1] 515 S.E.2d 508 (S.C.1999).

ADMISSIBILITY OF POLYGRAPH EVIDENCE IN FEDERAL COURTS

United States v. Scheffer, 523 U.S. 303 [118 S.Ct. 1261, 140 L.Ed.2d 413] (1998) held that military courts’ per se rule excluding polygraph evidence did not violate a defendant’s right under the Fifth or Sixth Amendment to present a defense. Beyond this holding, the decision lacks precedential value, given the fractured makeup of the Court’s three opinions.

In contrast to the majority of state courts, only two federal circuits have a per se rule barring admissibility. See United States v. Prince-Oyibo, 320 F.3d 494 (4th Cir.2003), Petition for Certiorari Filed, (July 11, 2003)(NO. 03-5297) and United States v. Skeens, 494 F.2d 1050 (D.C.Cir.1974)(citing the Circuit’s decision in Frye v. United States, 293 F. 1013 (D.C.Cir.1923)).

Most federal appellate courts leave admission of polygraph evidence to the discretion of the trial courts, but generally such evidence is excluded on the basis of Daubert/Rule 702 or Rule 403 or both. See United States v. Black, 78 F.3d 1, 7 (1st Cir.1996)(generally inadmissible); United States v. Santiago-Gonzalez, 66 F.3d 3, 6 (1st Cir.1995)(admissible if agreed to in plea bargain); United States v. Kwong, 69 F.3d 663, 668 (2nd Cir.1995)(balancing test under Rule 403); United States v. Lee, 315 F.3d 206, 214 (3rd Cir.2003)(noting lack of per se exclusionary rule and admissibility to rebut claim of coerced confession but declining to rule on admissibility at trial or revocation hearing), Petition for Certiorari Filed, (June 2, 2003)(NO. 02-11166); United States v. Posado, 57 F.3d 428, 434 (5th Cir.1995)(must meet Rule 702 and Rule 403 standards); United States v. Sherlin, 67 F.3d 1208, 1216-17 (6th Cir.1995)(Rule 403 standard, but results generally inadmissible, especially if unstipulated); United States v. Lea, 249 F.3d 632, 640 (7th Cir.2001) (“[W]e continue to hold that a district court need not conduct a full Daubert analysis in order to determine the admissibility of standard polygraph evidence, and instead may examine the evidence under a Rule 403 framework. Nonetheless, we posit that the factors outlined by the Supreme Court in Daubert remain a useful tool for gauging the reliability of the proffered testimony, as reliability may factor into a 403 balancing test.”).

See also United States v. Williams, 95 F.3d 723, 729-30 (8th Cir.1996)(suggesting non-stipulated evidence may be admissible under Daubert if Rule 403 is met) and United States v. Waters, 194 F.3d 926 (8th Cir.1999)(Daubert hearing unnecessary where 403 not met despite defendant passing test requested and given by prosecution); United States v. Cordoba, 194 F.3d 1053 (9th Cir.1999)(must meet 702 and 403); United States v. Call, 129 F.3d 1402 (10th Cir.1997)(evidence properly excluded under 403 where requested Daubert hearing not held); United States v. Gilliard, 133 F.3d 809 (11th Cir.1998)(Honts-administered polygraph inadmissible under 702 and under 403).

“Leaving discretion to trial courts rather than prescribing a per se rule does not seem to have changed practice substantially.” That is, “even when presented with an opportunity to admit polygraph evidence, most [federal] district courts are decidedly reluctant to do so.” See State v. Porter, [241 Conn. 57] 698 A.2d 739, 776-77 (Conn.1997).

One rare case admitting polygraph evidence was United States v. Galbreth, 908 F.Supp. 877 (D.N.M.1995). In Galbreth, Judge Vasquez admitted the expert opinion testimony of Dr. Raskin, the nation’s leading supporter of the validity of polygraph evidence, after finding it met the reliability criteria of Rule 702 and Daubert as well as being more probative than prejudicial under Rule 403. Dr. Raskin had given Galbreth a polygraph test, which the court described as “a properly conducted examination by a highly qualified, experienced, and skillful examiner.” Id. at 896. However, this ruling carries little weight due to its procedural placement.

The judge ruled from the bench after a hearing in March, 1995. In July, 1995, the case went to trial. At the conclusion of the Government’s case-in-chief, the Government dismissed the charges (income tax evasion). Galbreth’s polygraph evidence was never presented to the jury. On October 4, 1995, the judge issued a “Memorandum Opinion and Order” that detailed her ruling on the admission of the polygraph evidence. The Order was therefore unappealable and dicta.

State v. Porter, [241 Conn. 57] 698 A.2d 739, 777, n. 76 (Conn.1997) described Galbreth this way:

The most substantial of the few federal opinions permitting polygraph evidence at trial comes from the District Court of New Mexico. United States v. Galbreth, supra, 908 F.Supp. 877. The Tenth Circuit Court of Appeals had only addressed the question of polygraph admissibility before Daubert had been released; see United States v. Soundingsides, 820 F.2d 1232, 1241-42 (10th Cir.1987); so the court in Galbreth felt free to formulate its own standard. The court accepted that Daubert provided the proper threshold standard; id., at 878; and then relied largely on testimony by Raskin to conclude that polygraph evidence satisfied Daubert and rule 403 of the Federal Rules of Evidence. Id., at 895. Although the court in Galbreth did address many of the concerns that have motivated us to retain our per se rule of exclusion, it did so by recounting only the most propolygraph studies and information. Id., at 885-93. We believe that a more balanced review of the polygraph literature, such as we have conducted in the present case, reveals substantially more uncertainty regarding the effectiveness and prejudicial impact of the polygraph test than the court in Galbreth acknowledged.

Dr. Raskin scored the test as +29, and Dr. Honts scored it as +32, indicating a high probability of truthfulness. The Government’s expert, Dr. Barland, found the charts to be inconclusive. Galbreth at 894.

A critical issue was whether Galbreth knowingly failed to report income. Had Dr. Raskin testified, he would have been permitted to state that Galbreth’s “answers to the relevant questions regarding his knowledge and intent [were] consistent with a truthful polygraph outcome.” Id. at 895. (Emphasis added.) As the judge put it:

Dr. Raskin concluded that Defendant was truthful in his statements that he did not realize his returns under reported his taxable income. At trial, Defendant intends to call Dr. Raskin as an expert witness to testify about the testing procedures, to explain how the test was evaluated and to explain his interpretation of the results. Dr. Raskin is expected to testify that the results are indicative of a truthful polygraph test outcome with regard to the relevant questions. Dr. Raskin will not testify as to his personal opinion that Defendant was in fact telling the truth.

Id. at 878.

The testimony would therefore not be limited to Galbreth’s credibility but would cover his substantive answers to questions concerning his guilt or innocence. The judge would have allowed the assistant U.S. Attorney to cross-examine Dr. Raskin and to present the Government’s expert to “refute any of Dr. Raskin’s testimony relating to the polygraph technique in general or to the specific application of that technique in this case.” Id. at 896. There was no mention of permitting the Government to give Galbreth a polygraph exam.

By contrast, another district court in United States v. Crumby, 895 F.Supp. 1354, 1363 (D.Ariz.1995) admitted the evidence with severe limitations while noting that “the prejudicial effect of permitting the jury to hear the specific responses to the question of whether Defendant committed the ultimate crime in the case is overwhelmingly prejudicial.” That is, Crumby could introduce evidence that he took and passed the test if (1) he gave notice to the prosecutor, (2) took a government-administered test, (3) introduced the evidence only to support his credibility, if attacked, under Rule 608(a), and (4) the specific questions and physiological data were not introduced into evidence, although the general nature of polygraphy could be discussed by the experts under Rule 702. Id. at 1365. In Crumby, Dr. Raskin again testified, but unlike the Galbreth prosecutor, the U.S. Attorney did not offer any expert testimony as to the validity of the theoretical basis for the polygraph, nor contest Dr. Raskin’s testimony regarding the known error rate. The Crumby decision failed to mention any of the studies that challenge the validity of polygraph tests.

Galbreth and Crumby are exceptions, even within their own federal circuits, to the general rule that polygraph evidence is not admitted in federal courts. See United States v. Call, 129 F.3d 1402 (10th Cir.1997) and United States v. Cordoba, 194 F.3d 1053 (9th Cir.1999)(barring evidence under Rule 702 due to lack of known error rate for real life exams, controversy in scientific community regarding validity of theory behind test, and lack of controlling standards).

FINDINGS OF FACT

Decision theory and base rates

1. Measuring validity of polygraph test results is crucial to determining their admissibility. The following definitions come from PALD, page 29, et seq.

2. Decision theory is a scientific approach that applies basic statistics to real world problems. It is used to attempt to predict the utility of a test when there is a high degree of uncertainty before a test is conducted.

3. Reliability is a term used to indicate repeatability across different times, places, subjects, and conditions.

4. Test-retest reliability is the extent to which the same procedure, including the examiner, test format, and equipment used to examine the same subject for the same purpose, yields the same result on repetition.

5. Inter-rater reliability is the extent to which different examiners would draw the same conclusions about a given subject at a given time for a given examination.

6. A measurement is considered valid if it measures what it is supposed to measure.

7. Criterion validity refers to how well a measure captures what it is supposed to capture. In the case of a polygraph test, the question is whether it shows deception when the test subject is in fact deceptive and shows lack of deception when the subject is truthful. This is synonymous with accuracy.

8. Without accuracy or criterion validity no test or procedure can be considered valid.

9. Construct validity refers to how well explanatory theories and concepts account for performance of a test. Users can have greater confidence in a test when evidence of its accuracy is supported by evidence of construct validity. In other words, when there is a chain of plausible mechanisms that explain both the empirical findings of the test and evidence that each test mechanism operates as the theory prescribes.

10. A positive polygraph test result means that the test indicates deception. A negative polygraph test result means that the polygraph indicates no deception. Therefore, a false positive result means the test indicates deception when the test subject is being truthful and a false negative result means the test indicates no deception when the test subject is not being truthful.

11. Decision threshold is the cutoff point for deciding whether a result is positive or negative. Even though polygraph test results, like other diagnostic tests, are usually presented in a yes or no answer format, the actual score is not presented in that fashion. In other words, there is a cutoff point, below which or above which the test is not scored as a positive or negative. These cutoff points are policy choices made by polygraphers. If they are set incorrectly, it increases the chance for a false negative or false positive result.

12. The literature and the presentations focused to a great extent on the issue of base rates. Base rates are an essential element in establishing a level of confidence in the outcome of a diagnostic test. Base rates dictate whether a diagnostic test is worth considering at all.

13. Base rate refers to the proportion of people in a population as they relate to a particular trait in issue. For example, in polygraph testing, the percent of truth tellers versus deceivers would result in the base rate. While the cases refer to the rate of error, that is not the only number that a court should consider in determining admissibility under Rule 11-403 NMRA 2003. Even though a particular piece of information may have some slight tendency to make the existence of a fact of consequence more or less probable, the confidence one could have in that information in relation to the circumstances of the case may be so low as to render the evidence inadmissible under Rule 11-403 NMRA 2003.

14. The confidence level in decision theory is a function of the error rate and base rate. To be complete in evaluating any diagnostic test, accuracy has two components. In the polygraph context, these components are: How likely is the test to be positive (indicating deception) if lying is present; and, how likely is the test to be negative (indicating a lack of deception) if lying is not present.

15. In the world of medicine, for example, Dr. Zelicoff noted that in diagnosing strep throat that the disease is seasonal. During certain seasons, strep is so rare that the test result does not significantly add to our confidence level. That’s because due to seasonal fluctuation, the base rate of possible strep is so low, that even though the test accuracy is high, a positive test result does not increase our confidence that a decision made based on the test result will be correct.

16. In polygraph use, knowledge of the base rate can help decide whether the result of a polygraph test is worthy of consideration in making an important decision. In the employee screening context, the NAS focused on base rate since the percentage of spies is assumed to be very low. Dr. Zelicoff quoted the former Secretary of Energy as saying 1 in 10,000 employees of the Department of Energy are spies.

17. The accuracy rates of polygraph examinations are, at best, debatable in real life contexts. However, even if one assumes a high accuracy rate, the test is of little utility because of the low confidence level in the test result.

18. The NAS noted that if you use a test with 90% accuracy and an 80% threshold value (see p.61, PALD) and the test is used in a population with .1% (one in 1000) spies, the test would identify an average of 1606 as deceptive, only 8 of whom would be spies. PALD p.47.
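The arithmetic behind the figures in finding 18 can be sketched as follows. The population of 10,000 is an assumption: the finding does not state it, but it is consistent with 8 detected spies being 80% of the spies present at a base rate of one in 1000. The variable names are hypothetical, and the false positive rate is derived from the stated figures rather than taken from the finding.

```python
# Figures stated in the finding.
base_rate = 0.001            # one spy in 1000
flagged = 1606               # examinees identified as deceptive
true_spies_flagged = 8       # spies among those identified

# Assumption (not stated in the finding): size of the screened population.
population = 10_000

spies = round(population * base_rate)              # spies actually present
non_spies = population - spies                     # everyone else
sensitivity = true_spies_flagged / spies           # share of spies caught
false_positives = flagged - true_spies_flagged     # innocent people flagged
false_positive_rate = false_positives / non_spies  # share of innocent flagged
share_of_flagged_who_are_spies = true_spies_flagged / flagged

print(f"spies present: {spies}")
print(f"share of spies caught: {sensitivity:.0%}")
print(f"innocent examinees flagged: {false_positives}")
print(f"false positive rate: {false_positive_rate:.1%}")
print(f"chance a flagged examinee is a spy: {share_of_flagged_who_are_spies:.2%}")
```

Under these assumptions, roughly one flagged examinee in 200 is actually a spy, which is the sense in which a low base rate undermines confidence in a positive result.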

19. Dr. Iacono used a similar example to illustrate the problem as it might apply in the criminal context. If you assume a base rate of 90% guilty and 90% test accuracy (and a maximum threshold value) and apply those assumptions to 100 criminal defendants who take polygraph tests, the resulting confidence level in the test result is notable. Of the 90 guilty, 81 will fail the test and 9 will pass. The 81 test failures will not be disclosed to the jury, the court or the prosecution, of course, but the 9 passed tests will be disclosed. Of the innocent, 9 will pass and 1 will fail. The passes will be disclosed and the one failure will not. Of the 18 passed tests, there are only 9 (50%) who are factually not guilty. In other words, the confidence level of the test in its application is only 50-50. See Resp. Exhibit 4.
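The counting in finding 19 can be reproduced with a short sketch. It assumes, as the illustration does, that a single accuracy figure applies equally to guilty and innocent examinees and that no test is inconclusive; the function name `passed_test_breakdown` and the dictionary keys are hypothetical.

```python
def passed_test_breakdown(n: int, base_rate_guilty: float, accuracy: float) -> dict:
    """Split n examinees into guilty/innocent and pass/fail under the stated assumptions."""
    guilty = round(n * base_rate_guilty)
    innocent = n - guilty
    guilty_fail = round(guilty * accuracy)      # correctly classified guilty
    guilty_pass = guilty - guilty_fail          # guilty who pass (errors)
    innocent_pass = round(innocent * accuracy)  # correctly classified innocent
    innocent_fail = innocent - innocent_pass    # innocent who fail (errors)
    total_pass = guilty_pass + innocent_pass
    return {
        "guilty_fail": guilty_fail,
        "guilty_pass": guilty_pass,
        "innocent_pass": innocent_pass,
        "innocent_fail": innocent_fail,
        "total_pass": total_pass,
        "share_innocent_among_passes": innocent_pass / total_pass,
    }


# The figures used in the finding: 100 defendants, 90% guilty, 90% accuracy.
result = passed_test_breakdown(100, 0.9, 0.9)
for key, value in result.items():
    print(key, value)
```

Only the passed tests would be offered in evidence, so the relevant figure is the share of innocent examinees among those who passed, not the 90% accuracy figure itself.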

20. Petitioners have some arguments to address this illustration. First, they note that the base rate is not truly knowable. A defendant is, after all, presumed innocent and to clump an individual in with all others accused is to violate basic principles of American jurisprudence. Second, petitioners argue that the standard under Rule 11-401 NMRA 2003, is any tendency to make the existence of any fact that is of consequence to the determination of the action more probable or less probable than it would be without the evidence.

21. The argument points out that even though the confidence level of the test result in the context of these assumptions is only 50-50, it still makes a fact in issue more or less probable. In other words, even though the confidence level is merely 50%, the argument goes, it is still a 40% improvement over the pretest 90% figure. To rephrase, before the test result, given the population, one could be confident that any one of the population who denied culpability was 90% likely to be not telling the truth. After passing the test, one could be only 50% confident that the denial was untruthful. That move from a 90% confidence the testimony is false to a 50% confidence the testimony is false makes it more probable it is truthful than it was before and, so the argument goes, it is relevant.

22. The base rate issue is part of this Court’s analysis of the field study reliability and is a major issue raised directly by Respondents. Therefore its effect must be considered as it relates to polygraph evidence. This Court finds that, if polygraph testimony is reliable enough to be admissible, it would be deceptive to testify to the type of testimony offered in the past, such as claims that there is a 90% chance the test subject was truthful or that the test is 90% accurate.

23. Dr. Raskin and Dr. Honts both testified that in the absence of a known base rate, a base rate of 50% should be assumed. Both also testified that juries tend to work out their own base rates. In other words, in considering the strength of other evidence, juries give more or less weight to polygraph evidence.

24. The Court agrees that the base rate in an individual case is basically either unknowable or, at best, is a moving target based on the strength of all of the non-polygraph evidence. Yet it exists. To assume a base rate of 50% is no more reliable than any other assumption and is misleading. If any level of accuracy is testified to, it is either directly or inferentially suggestive of a confidence level in the result that is directly tied to a base rate most appropriately to be determined by the finder of fact. If the art of polygraphy were to ever achieve sufficient reliability for admissibility, it would be appropriate to prohibit any percent of accuracy to be introduced on direct examination. In other words, it would be inappropriate to testify that the test reflects a 90% probability that the test subject was truthful if it is not possible to accurately express how confident the jury could be in that number given the population of test subjects. Any probative value of such testimony would be substantially outweighed by the danger of confusion of the issues, misleading the jury, and undue waste of time.
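The dependence of the confidence level on the assumed base rate can be illustrated with a short calculation. This is a sketch under simplifying assumptions that are not findings of the Court: a single accuracy figure is applied to both truthful and deceptive examinees, inconclusive results are ignored, and the function name `prob_truthful_given_pass` is hypothetical.

```python
def prob_truthful_given_pass(base_rate_deceptive: float, accuracy: float) -> float:
    """Probability that an examinee who passed is truthful, by Bayes' rule."""
    truthful_and_pass = (1 - base_rate_deceptive) * accuracy
    deceptive_and_pass = base_rate_deceptive * (1 - accuracy)
    return truthful_and_pass / (truthful_and_pass + deceptive_and_pass)


# Same assumed 90% accuracy, three different assumed base rates of deception.
for base_rate in (0.1, 0.5, 0.9):
    confidence = prob_truthful_given_pass(base_rate, 0.9)
    print(f"base rate of deception {base_rate:.0%}: "
          f"P(truthful | pass) = {confidence:.1%}")
```

Only when the base rate is assumed to be 50% does the confidence in a passed test equal the stated 90% accuracy; at a 90% base rate of deception the same test yields the 50-50 result described in finding 19.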

Known rate of error in operation

25. The only way to determine the “rate of error in operation” of the polygraph test procedure is to test the operation of the procedure and determine its reliability or accuracy.

26. To test a theory, one must start with a hypothesis.

27. There is no sound scientific theory upon which polygraph is based.

28. Dr. Honts claims to have a hypothesis that is being tested, that of whether a comparison question test accurately diagnoses truth and deception. However, there is no explanation as to why it does so if it does indeed do so.

29. There is no lie response. There is no one testable physiological manifestation of a lie. Polygraphs test physiological responses to questions and, if there is a physiological response, the thinking is that if the response is greater for a relevant question than for a comparison question, then it means the response to the relevant question is likely to be deceptive. However, any physiological response to any question could be caused by any one of a number of emotions such as shame, anxiety, guilt, fear, tension, or other emotional responses not understood. There is no single underlying process reflected in responses to questions that are measured by the polygraph. The polygraph measures a variety of psychological and physiological processes, including some that can be consciously controlled.

30. In the comparison question test, one emotional or physiological response to the relevant question could cause a measurable result on the polygraph and a completely different emotional or physiological response to the comparison question could cause a measurable result on the polygraph. Yet the level of response for each of the two responses is what is measured and compared, resulting in the gauge of truth-telling.

31. The comparison questions are not determined in advance and are either directed lie or probable lie questions. A directed lie means in the pre-test interview the test subject is told to lie to the question which will supposedly result in the physiological response. A probable lie is similar in operation, but is a question like: “Have you ever taken anything of value that did not belong to you?” Pretest procedures sometimes include card tricks or similar techniques to convince the test subject that the test is working and will detect deception. No standards exist for how the pre-test procedures will be conducted or for how the comparison question will be formulated.

32. The vast majority of the tests upon which the claimed accuracy of polygraph examinations is based are laboratory tests, as opposed to field tests.

33. In most laboratory tests, the subject is given a series of written instructions and during the course of following those instructions will or will not “steal” an item. Then the subject is immediately subjected to a polygraph examination.

34. In most field tests, results of polygraph examinations by various law enforcement agencies are examined to determine if they were correct.

35. The accuracy of a test in the field can only be determined if objective truth is known. If objective truth is not known, then you can not determine if the test accurately detected deception.

36. The method for determining objective truth in field tests is usually based on whether or not there was ultimately a confession either by the subject of the polygraph or by others who then exonerate the test subject. If nobody confesses, then the test result is not considered in determining accuracy.

37. This technique effectively limits the ability to measure polygraph accuracy in the field, since all test results are thrown out if there is not a confession. It is highly unlikely that subjects in a field study would confess if they passed the polygraph. A fair assumption is that a guilty subject would have a vested interest in passing the polygraph. That is one of the ideas proponents assert to argue that the stress of facing the relevant question would result in a more pronounced response than the control question. If it’s so important to pass, why would anyone who’s successfully passed the polygraph in a real life setting then decide to reveal the truth? Why would the subject bother taking the polygraph in the first place if the point wasn’t to try to get away with it? If that assumption is correct, and this Court, based on years of experience on the bench and in a criminal practice, as well as after reviewing all of the evidence and testimony in this case, finds that it is, field studies do not produce a reliable error rate. None of the errors are likely to admit they were “errors”.

38. Conversely, the truly innocent person who is scored as having failed the polygraph examination is also highly unlikely to confess to the crime they did not commit. Again, this error would not reach the final tally of test “success” since the result would not be considered at all as there was no confession. If the innocent person falsely confessed, which appears to happen from time to time, that would also inflate the accuracy figures of the field study and distort the claimed error rate.
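The selection effect described in Findings 36 through 38 can be illustrated with simple arithmetic. This is a sketch under stated assumptions: the true accuracy and the confirmation probabilities below are hypothetical and do not come from the record; they only show how discarding unconfirmed cases can inflate a reported accuracy figure.

```python
def apparent_accuracy(true_accuracy, p_confirm_if_correct, p_confirm_if_error):
    """Accuracy observed among only those cases later confirmed by a confession.

    true_accuracy        -- share of all test results that are actually correct
    p_confirm_if_correct -- chance a correct result is later confirmed and kept
    p_confirm_if_error   -- chance an erroneous result is later confirmed and kept
    """
    kept_correct = true_accuracy * p_confirm_if_correct
    kept_errors = (1.0 - true_accuracy) * p_confirm_if_error
    return kept_correct / (kept_correct + kept_errors)


# Hypothetical: the test is right 75% of the time, but an error (for example,
# a guilty subject who passed) is rarely followed by a confession.
biased = apparent_accuracy(0.75, p_confirm_if_correct=0.40, p_confirm_if_error=0.02)
# If errors were confirmed as often as correct results, nothing would be distorted.
unbiased = apparent_accuracy(0.75, p_confirm_if_correct=0.40, p_confirm_if_error=0.40)
print(f"apparent accuracy with selective confirmation: {biased:.1%}")
print(f"apparent accuracy with equal confirmation:     {unbiased:.1%}")
```

With these assumed inputs, a test that is correct 75% of the time appears to be correct in more than 98% of the cases that survive into the study, because almost all of the errors are thrown out with the unconfirmed results.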

39. Experimental field studies are the most compelling type of field validation study. This would be a study in which a variable of interest is manipulated among polygraph examinations in real-life settings. No experimental field studies are found in any of the literature on polygraph validity. PALD at 109-110.

40. At the top of research hierarchy is the peer reviewed publication. No specific-incident field investigations are found in the higher levels of research hierarchy. PALD at 114.

41. The field test results suggest that polygraph examinations are an effective interrogation tool because they seem to produce a significant number of confessions. This utility is separate from polygraph validity. According to NAS: “There is substantial anecdotal evidence that admissions and confessions occur in polygraph examinations, but no direct scientific evidence assessing the utility of the polygraph. Indirect evidence supports the idea that a technique will exhibit utility effects if examinees and the public believe that there is a high likelihood of a deceptive person being detected and that the costs of being judged deceptive are substantial.... there is no evidence to suggest that admissions and confessions occur more readily with the polygraph than with a bogus pipeline— an interrogation accompanying the use of an inert machine that the examinee believes to be a polygraph.” PALD at 214-215.

42. Because there is no underlying theory explaining why polygraphs detect deception, it limits the ability to determine effectiveness in contexts that vary from the lab settings or the limited number of field tests. For example, the majority of polygraph test results offered in evidence in New Mexico (all of the test results in the cases in issue in these appeals) are offered by the defendant.

43. Because laboratory tests are so dissimilar from the complex matrix of variables that can occur in real life, they are not sufficiently useful for determining the accuracy of polygraph testing in real life contexts.

44. The context of a polygraph test offered by a defendant differs in many material ways from the lab setting and field tests. First, the delay between the targeting of the suspect and the test is often significant. Second, the pressure to perform is different since the result of the test will not be disclosed if the defendant fails the test. Third, given the delay, the defendant may become habituated to answering questions about the pending charges and therefore may not react as strongly to relevant questions during the polygraph test. Fourth, the polygrapher is “friendly” to the defense. Fifth, the opportunity for the defendant to learn and utilize counter-measures is increased.

45. An example of the types of problems that are inherent in most laboratory studies was demonstrated by a laboratory study conducted by Dr. Iacono which was designed to introduce some real stressors into the test dynamic, stressors that are more likely to mimic real life situations. Dr. Iacono went to a population that Dr. Raskin used for one of his lab studies, prisoners. But instead of using the traditional Raskin approach of offering a nominal financial reward if the test is “beaten”, Dr. Iacono generated some real pressure. He told the prisoners that he would pay them if they “beat” the polygraph, but that the payment would be to all of the prisoners or none. He told them that he expected a certain percentage to be successful and that if they fell below that percentage nobody would get paid and he would publish the names of the prisoners who failed to pass the polygraph in the prison. At the conclusion of the test he paid everyone and didn’t publish any names. However, the test accuracy fell from Dr. Raskin’s 94% to 72%, even though it was the same population group. As Iacono described it, he set up a group contingency threat, where each test subject would be concerned about the consequences of the test outcome. The study was published in The Journal of Applied Psychology, a peer reviewed publication. TT, 6/24/03, 46-48.

46. The Iacono prisoner study is one example of what can happen if a key and relevant variable is altered to more closely approach real life. Unfortunately, there are not enough studies that try to answer these types of questions.

47. No scientific field studies of the friendly polygrapher scenario have been conducted. Given the variables, the risk of significant impact is great. In the normal scenario, the scenario from which the field studies have been derived, the test is conducted in an adversarial setting. The goal of the police officer conducting the test is to catch somebody. The focus is intense and the consequences of failing the polygraph are great.

48. In the friendly polygraph there is no adversarial atmosphere.

49. The Rosenthal Effect is a phenomenon that has been recognized in psychology for approximately thirty years. It recognizes that psychologists and scientists and others who have an investment in a theory are likely to unconsciously arrange an experiment in such a way that they get favorable results. It is the reason that test results need to be replicated by an independent researcher.

50. The Rosenthal Effect can affect an individual polygraph examiner because the hypothesis in an individual test involves the examiner’s sense of whether the test subject is guilty or not. The examiner necessarily has access to the case facts and interviews the examinee in a pre-test interview. Based on the case information and how the interview develops — for example the examinee might seem truthful — it can affect the attitude of the examiner. The Court noted the following statement from Dr. Honts: “In my experience in New Mexico in testifying before juries clearly indicates that, (the jury will make use of the polygraph as they see fit) and that they have decided to convict despite a polygraph that showed the person was truthful.” TT, 7/3/03, 114. The context of the statement and the observation of the witness led the Court to conclude that Dr. Honts was invested in the outcome and that he was surprised that a jury could reach a different conclusion.

51. The risk of the Rosenthal Effect is exacerbated by the lack of standards in the profession.

52. There is no requirement that the test subject be drug free. However, drugs that act to decrease responding in a general way will not normally affect the control question test because the scoring is based on comparing responses to two types of questions. The problem is, there is at least one study that indicates that alcohol could reverse the responses in a control question setting. Dr. Iacono was unable to duplicate the result of the study. More research needs to be done in this area.

53. Since it is not clear what emotional triggers will result in a particular reading in a polygraph chart and since different emotions may produce a given polygraph response in the control versus the relevant question, there is no way to determine if the drug may affect one emotional response, but not another.

54. There are no standards which dictate whether an examiner should use a probable lie versus a directed lie versus a relevant-irrelevant test.

55. There is no restriction regarding testing mentally ill individuals. However there is at least one study that indicates that psychopaths are not more able to defeat the polygraph than others.

56. While there are supposed guidelines that dictate the form of relevant question, they seem to be subject to unreasonable interpretation by practitioners. Dr. Raskin, on the one hand takes the position that intent is not a proper subject for a relevant question, yet claims that asking a relevant question regarding whether touching the victim’s penis was for “sexual purposes” is not problematic. TT, 7/1/03, 217-218. (Regarding the questions asked in State v. Robinson, one of the pending cases).

57. At this point there remains no licensing requirement for polygraphers in New Mexico.

58. There is no blind proficiency testing requirement in New Mexico.

59. Covert counter-measures consist of simple techniques such as biting the tongue, flexing the toes, or performing mentally stressful math exercises. These activities, if timed to take place during the control question phase of the test, can artificially augment the “involuntary” physiological response.

60. Counter-measures are effective in affecting polygraph test outcomes. One laboratory study indicates that with less than a half hour training or explanation, the likelihood of a false test result increases by 50%. There is a consensus among scientists that counter-measures are effective. Some studies indicate that merely reading about countermeasures is insufficient to affect test outcomes, but more research is necessary in this area. See, State v. Porter, 241 Conn. 57, 113, 698 A.2d 739, 768 (1997).

61. This Court shares the concern of the Connecticut Supreme Court in Porter, noting the informal study cited in that case where twenty-seven inmates were given fifteen minutes of instruction by a fellow prisoner (who had been instructed by Dr. Lykken) before reporting for a polygraph exam regarding an alleged infraction of prison rules. All twenty-seven privately admitted their guilt and twenty-four passed the polygraph. Id., at 241 Conn. 114, 698 A.2d 768. Although that study is appropriately criticized by Dr. Raskin, see, Faigman, § 19-2.2.2 FN 72, the specter of the ease of communicating how to successfully utilize counter-measures remains.

62. Experienced examiners could not detect counter-measures in the lab study.

63. There are no properly conducted studies regarding the effectiveness of counter-measures in real life by sophisticated test subjects.

64. In PALD, the authors note: “Notwithstanding the limitations of the quality of the empirical research and the limited ability to generalize to real world settings, we conclude in populations of examinees, such as those represented in polygraph research literature, untrained in counter-measures, specific instance polygraph tests for specific investigations can discriminate lying from truth well above chance and well below perfection, and accuracy may be highly variable across situations.” Id. at 214.

65. However, there is no guarantee that the populations of test subjects that are likely to offer the test in evidence in New Mexico are “untrained in countermeasures.” Also, it must be kept in mind that the context of all of the research referred to was in relation to specific investigations in either laboratory settings or field studies based on adversarial test situations. As a result, the conclusion that tests in those situations can discriminate lying from truth “well above chance” is irrelevant to the inquiry of this Court.

66. Computer scoring of test results is a recent development. However, the algorithms for the programs are based on certain assumptions:

• that the probability of truth or deception in real-world situations can be determined from the score on a control question test (the basic assumption of lie detection);

• that the scores stored in the computer accurately represent the scores to be expected from truthful or deceptive subjects obtained under circumstances similar to those in the instant test;

• that 50 percent of those who are tested with the instrument are deceptive (the base rate problem discussed elsewhere)

See, Faigman, § 19-3.3.9. Because of the problems with field studies no database meeting the above criteria exists. The computer scoring results in an expressed confidence level presented as a percent likelihood that the test subject is truthful. Examiners will testify, for example, that the test score shows the likelihood that the subject was truthful is 93.3%. As discussed above, this is without a scientific basis and deceptively ignores the problem with base rates.

Acceptance in the Relevant Scientific Community

4. The relevant scientific community is The Society for Psychophysiological Research and Fellows in Division One of the American Psychological Association, a division of the American Psychological Association General Psychology Group broadly versed in principles of psychology.

5. There have been four attempts to survey the relevant scientific community for its views of the validity of polygraph examinations.

6. Of the four attempts, the most reliable is the survey conducted by Dr. Iacono and published in The Journal of Applied Psychology, a peer reviewed publication.

7. While Dr. Honts is critical of the methodology, the response rate was the highest by far, and the survey clarified potential ambiguities found in the other surveys. The Court finds it significant that the article relating to the Iacono survey and the results were selected by the publisher of a book on research methodology to be used as an exemplar of how to do similar types of research. Further, unlike the Iacono survey, the other surveys did not distinguish between control question tests and guilty knowledge test.

8. 36% of those responding felt the control question polygraph test was based on scientifically sound psychological principles and theory. This compares with 22% who agreed with that statement regarding the directed lie test and 77% who agreed with the question in the guilty knowledge test.

9. A significant majority also agreed that a “friendly” test was more likely to be passed than an adversarial test. 99% believed that counter-measures might work.

10. On the issue of the weight to be given laboratory studies as opposed to field studies, only 17% believed that results of laboratory studies should be given substantial weight.

11. The Iacono survey results were consistent with the NAS view that the high levels of accuracy claimed by practitioners have rarely been reflected in empirical research. NAS, p. 107.

12. Control question polygraph tests do not enjoy general acceptance within the relevant scientific community.

13. This finding is even more significant given the length of time the polygraph has been in use. The polygraph is not “cutting edge” technology that would tend to be esoteric. It is technology that would be familiar to members of The Society for Psychophysiological Research and Fellows in Division One of the American Psychological Association.

CONCLUSIONS OF LAW

1. Polygraph test results and the conclusions derived from them are not based upon an overarching theory. To the extent it is merely argued that there is a hypothesis that the test reliably detects deception, that hypothesis has not been subjected to field research. The existing laboratory research, given the problems described above, is woefully inadequate to support admissibility in court in real life contexts.

2. There is no theory, as stated above. The technique has been subjected to limited peer review publication. The conclusions of the relevant publications do not enhance confidence in the test results, particularly considering the effectiveness of counter-measures.

3. The potential rate of error is vague and unreliable. Given the effect of ignoring base rates as endorsed by proponents, the reliability of test results as reflected in an actual percentage misrepresents the confidence level in the test.

4. There are no set standards other than those set out in Rule 11-707 NMRA 2003. Those standards are insufficient for the reasons set out above.

5. Control question polygraph tests are not accepted in the relevant scientific community at a significant level, particularly considering the age of the technique.

6. The technique is not based upon well-recognized scientific principles and is not capable of supporting opinions based upon reasonable probability rather than conjecture.

7. If the risk of counter-measures is ignored, there is an argument that all of the studies taken together support a conclusion that a successful polygraph result makes a fact in issue more or less probable. However, given the state of the art of polygraphy, the limited probative value of polygraph test results is substantially outweighed by the danger of confusion of the issues, undue delay, and waste of time and therefore polygraph evidence becomes inadmissible under Rule 11-403 NMRA 2003.

8. At least one court has found that testimony that someone has passed a polygraph examination is extrinsic evidence of a specific instance of conduct (passing the polygraph) that supports a witness’s credibility, and is therefore inadmissible under Rule 11-608 B. US v. Piccinonna, 729 F.Supp. 1336, 1338 (S.D.Fla.1990), aff'd by U.S. v. Piccinonna, 925 F.2d 1474 (11th Cir.1991).

9. Because of the inherently subjective nature of the test procedure, the polygraph examination can not be repeated. Successful repetition of a test is the cornerstone of the scientific method. It lacks test-retest reliability.

10. The results of polygraph testing are not sufficiently reliable for admissibility in courts in New Mexico.

/s/ Richard J. Knowles

Richard J. Knowles

District Judge 
      
      . NAS, The Polygraph and Lie Detection 12-13, 81 (2003)
     
      
      . Id. at 13.
     
      
      . Id.
      
     
      
      . Id. at 11-12.
     
      
      . William G. Iacono and David T. Lykken, The Scientific Status of Research on Polygraph Techniques: The Case Against Polygraph Tests, § 19-3.3.4[5], in 2 MODERN SCIENTIFIC EVIDENCE: THE LAW AND SCIENCE OF EXPERT TESTIMONY (David L. Faigman, David H. Kaye, Michael J. Saks & Joseph Sanders eds., 2002)
     
      
      . Charles R. Honts, David C. Raskin, & John C. Kircher, The Scientific Status of Research on Polygraph Techniques: The Case for Polygraph Tests, § 19-2.2.3[1], in 2 MODERN SCIENTIFIC EVIDENCE: THE LAW AND SCIENCE OF EXPERT TESTIMONY (David L. Faigman, David H. Kaye, Michael J. Saks & Joseph Sanders eds., 2002)
     
      
      . Iacono at § 19-3.1.1[1].
     
      
      . PALD at 253.
     
      
      . Honts at § 19-2.1.2[3]
     
      
      . Iacono at § 19-3.1.1[2][A].
     
      
      . The Legal Relevance of Scientific Research on Polygraph Tests, Per se Exclusion § 19-1.2.1 in 2 MODERN SCIENTIFIC EVIDENCE: THE LAW AND SCIENCE OF EXPERT TESTIMONY (David L. Faigman, David H. Kaye, Michael J. Saks & Joseph Sanders eds., 2002)
     
      
      . Faigman et al. at § 19-1.2.3, fn. 73 and 74.
     
      
      . Faigman et al. at § 19-1.2.2.
     