
    Commonwealth v. Stanley Slonka
    Superior Court, Hampden, SS
    No. 911479
    Memorandum Dated January 20, 2000
    Wernick, J.

On April 28, 1991, Karen Mills (“Mills”), a prostitute addicted to cocaine, entered a pickup truck, was driven to a remote area and was viciously stabbed in the abdomen with what appeared to be hedge shears and left to die. She was discovered by a passerby and taken to a nearby hospital where her life was saved. Mills subsequently identified the defendant, Stanley Slonka (“Slonka”), as her attacker. Slonka was arrested on June 5, 1991, and was arraigned on June 11, 1991, in the Hampden Superior Court on indictments charging armed assault with intent to murder and assault and battery by means of a dangerous weapon. A jury returned verdicts of guilty on both indictments on March 13, 1992. Slonka was sentenced to concurrent terms at MCI Cedar Junction of 15 to 20 years on the indictment charging armed assault with intent to murder and 9 to 10 years on the indictment charging assault and battery by means of a dangerous weapon. On May 30, 1997, Slonka’s conviction was reversed and the case was remanded for new trial. Commonwealth v. Slonka, 42 Mass.App.Ct. 760 (1997), further appellate review denied, 425 Mass. 1108 (1997).

On March 16, 1999, Slonka was administered a polygraph examination by James A. Johnson, Jr., (“Johnson”) at the law offices of Slonka’s attorney, William O’Neill. This was a confidential polygraph examination, meaning that if Mr. Slonka failed the examination, the results would remain privileged and confidential. Johnson used the control question technique for the examination. There are various types of control question examinations. The technique used by Johnson is known as the probable lie comparison question technique. This technique involves asking the subject relevant questions and comparison questions. Relevant questions deal with the specific accusations at issue and are interspersed with control questions during the polygraph test. In this case, the relevant questions were (a) did you on April 28, 1991, stab Karen Mills? (b) did you on April 28, 1991, stab Karen Mills near the Connecticut River? (c) was Karen Mills on April 28, 1991, ever in your pickup truck? Control or comparison questions are designed to elicit probable lies from the subject on matters remote in time and subject matter from the relevant questions. An example of a control or comparison question used during Slonka’s examination is “before the age of 20, other than what you told me about, did you ever lie to a priest or nun?” Throughout the examination, the polygraph device measures certain involuntary physiological responses of the subject during each question. These responses are respiratory activity, sweat gland activity, heart rate and blood pressure, or, more scientifically, electrodermal, pneumatic and cardiovascular activity.

The control question technique was introduced in 1947 by John Reid. The technique is based upon the emotion of fear as the dominant factor in the psychology of polygraph. The psychological theory, according to Johnson, “is that fear of entrapment and exposure experienced by the living individual will produce autonomic (uncontrollable) physiological responses which can be recorded with the polygraph instrument and interpreted by the examiner. After each question is asked by the examiner, the examinee will experience physiological changes, which are transmitted through the instrument to individual ink pens, which record the changes automatically and simultaneously on moving chart paper.” The basic premise is that guilty individuals will show stronger responses to relevant questions than to the comparison questions, while innocent individuals will respond more strongly to the comparison questions, since the comparison question presumably will pose a greater threat to the innocent individual than to the guilty person.

The first part of a control question polygraph examination consists of a pre-test interview to obtain identification and biographical information, establish rapport between the participant and the examiner, review the questions, both relevant and comparison, with the subject and explain the instrumentation portion of the examination to follow. The pre-test is essential for a number of reasons, including, but not limited to, determining whether the subject is a suitable subject for polygraph examination, conditioning the subject, persuading the subject that the examiner will be able to detect untruthful responses, designing the comparison questions and ensuring that the subject will not be confused in any way by any of the questions. Failure to conduct an adequate pre-test will result in an unreliable polygraph examination.

Slonka’s response to each of the relevant questions was “no.” Johnson determined by examining the readout from the polygraph machine that Slonka’s responses were consistent with truthfulness.

Before Johnson administered the polygraph examination to Slonka, a study was commissioned to determine Johnson’s accuracy as a polygraph examiner. This study was performed by Professor Jack Levin and his assistant Janese Free, at Northeastern University, between January 1999 and March 1999. The unpublished report of this study describes the study as follows:

The accuracy of polygraph examinations administered by examiner James A. Johnson, Jr. was tested with a sample of 20 students at Northeastern University. The study utilized the well-established “mock crime” model. Half of the participants were randomly assigned the “guilty” condition and instructed to commit a mock purse snatching. Half of the participants were randomly assigned the “innocent” condition and did not commit a mock crime. All participants were instructed to deny committing the purse snatching when polygraphed. The examiner, who had no knowledge of the guilt or innocence of the participants, conducted a control question test polygraph to determine whether each participant was being deceptive. By monitoring and recording the participant’s electrodermal, pneumatic, and cardiovascular activity during the examination using relevant and control comparative questions, quantitative evaluations of the physiological responses were made. Excluding the 3 inconclusives and 1 invalidated test, the examiner correctly detected 7 out of the 7 deceptive subjects, and 8 out of the 9 truthful subjects.

Each subject was paid $8.00 per hour for his/her participation. A $50.00 bonus was offered each subject who passed the test. Further details of this study are described later in this memorandum.

Slonka has now moved that the results of his polygraph examination of March 16, 1999, be admitted at his retrial. He has also moved that an additional polygraph examination be conducted by the Commonwealth’s polygraph examiner, or by another neutral examiner, and that the results of that examination also be admitted, whatever they turn out to be. Evidence was presented to the court over three days and numerous memoranda were filed by the parties. After consideration of all of the evidence and the parties’ memoranda, the court concludes that Slonka’s motion must be denied in its entirety.

“Polygraph evidence, with or without pretest stipulation, is inadmissible in criminal trials in the Commonwealth either for substantive purposes or for corroborating or impeachment of testimony.” Commonwealth v. Mendes, 406 Mass. 201, 212 (1989). Mendes applied the standard established in Frye v. United States, 293 F. 1013 (D.C. Cir. 1923), and Commonwealth v. Fatalo, 346 Mass. 266 (1963). That standard was that “[j]udicial acceptance of a scientific theory or instrument can occur only when it follows a general acceptance by the community of scientists involved.” Commonwealth v. Lykus, 367 Mass. 191, 196 (1975), quoting Fatalo, supra at 269. Noting that “[W]e have never determined that the appropriate scientific community, which includes physiologists and psychologists, has generally accepted the validity of polygraphy as a scientific means of detecting deception,” Mendes, supra at 207, the Court concluded that “[F]ifteen years has been more than enough time for examination and evaluation . . . [O]ur hope that polygraphy would mature to the point of general scientific acceptance has not materialized.” Id. at 212.

The standard for admissibility of scientific evidence has changed since Mendes. It is that change that Slonka seizes upon to argue that the admissibility of polygraph evidence is again an open question in this Commonwealth. In Commonwealth v. Lanigan, 419 Mass. 15 (1994), the Court rejected strict adherence to general acceptance as the sole determinant of admissibility, holding that “[t]he ultimate test ... is the reliability of the theory or process underlying the expert’s testimony.” Id. at 24. The Court, however, emphasized that “[w]e suspect that general acceptance in the relevant scientific community will continue to be the significant, and often the only issue. We accept the idea, however, that a proponent of scientific opinion evidence may demonstrate the reliability or validity of the underlying scientific theory or process by some other means, that is, without establishing general acceptance.” (Emphasis added.) Id. at 26. Other factors may be considered by a trial court in determining the reliability of the scientific theory or process underlying proffered opinion testimony, including whether the theory or process can be or has been tested; whether the theory or process has been subject to peer review; the known or potential error rate of the process; and the existence of standards controlling the process’ operation. See Daubert v. Merrell Dow Pharmaceuticals, Inc., 113 S.Ct. 2786, 2796-97 (1993), cited with approval in Lanigan at 25-26. A court, furthermore, is free to consider other factors that the court believes are appropriate to a determination of reliability of the proffered evidence in the particular case. See Kumho Tire Company v. Carmichael, 119 S.Ct. 1167, 1175 (1999).

After Lanigan, the Supreme Judicial Court revisited the issue of the admissibility of polygraph evidence in Commonwealth v. Stewart, 422 Mass. 385 (1996). Acknowledging that “the reliability of the theory underlying polygraph testing may now be shown by means other than general acceptance in the relevant scientific community,” the Court concluded that “the defendant [must show], by proposed proof or by reference to scientific or legal publications, that the perception of the reliability of polygraph evidence has, for good reason, changed significantly since our Mendes opinion.” Id. at 389. In dicta, the Court also suggested that “[i]f polygraphic evidence is to be admissible in a given case, it seems likely that its reliability will be established by proof in a given case that a qualified tester who conducted the test had in similar circumstances demonstrated, in a statistically valid number of independently verified and controlled tests, the high level of accuracy of the conclusions that the tester reached in those cases.” Id. at 389.

With these principles in mind, this court will first examine the reliability of polygraph evidence under Lanigan, employing the Daubert factors and other factors the court considers relevant to the issue of reliability. Focus will then shift to the Stewart analysis of the validation of Johnson’s accuracy as an examiner and finally to other considerations relevant to the admissibility of polygraph evidence.

I. LANIGAN ANALYSIS

A. GENERAL ACCEPTANCE

No persuasive evidence has been presented in this case of any significant movement toward general acceptance of polygraph examinations within the relevant scientific community since Mendes. In order to determine the existence of general acceptance, the Supreme Judicial Court looked not only to the relevant scientific community, but also to whether courts were generally admitting polygraph examination results into evidence. Mendes, supra at 205, 209. While a few courts since Mendes, which had not previously admitted polygraph examination results, now do so, most courts continue to exclude them. There has been no move toward general acceptance within the courts.

All of the concerns expressed in Mendes remain today. There is still no consensus among experts as to the accuracy of polygraph testing to detect deceit. The literature continues to note that the error rate for innocent subjects who take the test is higher than for guilty subjects. Furthermore, surveys of the relevant scientific community both before and since Mendes have failed to demonstrate any movement towards general acceptance. Four such surveys have been brought to the court’s attention. The first two were surveys of members of the Society for Psychophysiological Research (SPR), a much narrower group of scientists than the physiologists and psychologists identified in Mendes as included within the relevant scientific community. The first survey was conducted by the Gallup organization in 1982, before Mendes. 13 Polygraph, 153 (1984). Approximately sixty percent of the respondents replied that polygraph examination was a useful diagnostic tool when considered with other available information. Thirty-three percent found it of questionable usefulness, entitled to little weight. Three percent found it of no usefulness. One percent found it sufficiently reliable to be the sole determinant.

In 1993, Susan L. Amato, as an unpublished master’s thesis, conducted another survey of the same organization asking the same four questions. Amato, A Survey of Members of the Society for Psychophysiological Research Regarding the Polygraph: Opinions and Implications (1993). The results were strikingly similar. Ignoring for the moment that this survey is unpublished, it demonstrates at best no meaningful change in acceptance of the polygraph by those most likely familiar with it in the ten years since the Gallup survey and, therefore, no change since Mendes.

The last two surveys were reported in Iacono and Lykken, The Validity of the Lie Detector: Two Surveys of Scientific Opinion, 82 Journal of Applied Psychology, 426-33 (1997). Noting that the previous two surveys had not distinguished between types of polygraph examinations or the end use of the examination (such as screening or as evidence at trial), Iacono and Lykken’s surveys were considerably more specific. One survey was of SPR. The second survey was of psychologists who had been elected as fellows of general psychology of the American Psychological Association (APA). When asked whether they would permit the control question examiner to testify before a jury as to his opinion whether a defendant was deceptive or truthful when denying guilt, approximately seventy-five percent of both organizations responded “no.” It is abundantly clear, therefore, that the control question technique of polygraph examination has not gained general acceptance within the relevant scientific community since Mendes. Indeed, the most recent surveys suggest that it is less generally accepted than previously believed.

B. CAN THE THEORY BE EMPIRICALLY TESTED

Slonka has failed to demonstrate that the basic assumptions of the control question method are grounded in scientific theory. The primary reason for this is that the theory has not been, and may not be capable of being, tested in real-life conditions. The major hurdle to such testing is the difficulty of establishing ground truth. In order to determine whether someone is in fact lying, it is necessary to know the truth with certainty (ground truth). Yet when the results of polygraph examinations are reviewed in real-life situations, it is impossible to know what the truth is with certainty. Therefore, most real-life studies that have been conducted and that have been described to the court involve reviewing polygraph examinations performed on individuals who have subsequently confessed. It is not appropriate, however, to generalize from such studies the accuracy rate of polygraph examinations on all subjects in real-life situations.

The test subjects, convicted criminals who have taken polygraph tests and subsequently confessed, are not fairly representative of the general population, or even of the population of criminal suspects. Studies based upon such subjects are very likely tainted by sampling bias. A confession is no guaranty that the person confessing is in fact guilty. Such studies, furthermore, tell us nothing about how many individuals lied to the polygraph examiner and got away with it; nor do they tell us anything about how many individuals told the truth to the polygraph examiner and were wrongfully found to be lying. Indeed, there appears to be no way to field test those who have successfully lied to the polygraph examiner through the employment of countermeasures or by other means. Finally, the evidence fails to demonstrate that the studies relied upon by the defense involved videotaped examinations. Without videotapes of the pre-test and examination itself, it is impossible to know to what extent an appropriate pretest was conducted, whether visual clues or other clues as to credibility affected the examiner’s conclusions as to credibility, or whether the examiner’s questions were posed so as to encourage or evoke particular responses.

Laboratory studies, at least those brought to the court’s attention, while solving the problem of establishing ground truth, are too contrived to test reliably the accuracy of the polygraph in real-life situations. These studies usually involve the commission of a mock crime by some segment of the subjects to be tested. All subjects are instructed to deny having committed the crime. The crime may involve a theft of personal property from an unprotected place or from an individual, such as a purse snatching. Those subjects who will commit the mock crime are selected at random. All subjects are generally paid for their time and are offered an incentive, usually a monetary amount of $50 or less, for passing the test. The subjects are then tested by the polygraph examiner who attempts to determine which subjects committed the crime.

Such laboratory studies suffer from a number of shortcomings. First, and perhaps most important, the subjects of the study have no significant personal stake in the outcome. These individuals are not criminals, since they have not actually committed a real crime, nor are they individuals who have been wrongfully accused of a crime. At worst, they stand to lose $50 if they are unable to pass the test. The subjects, often students, are involved in a game, not in a real life crisis, the consequences of which could be prison. No research has been presented providing a scientific basis for concluding that real criminals respond to polygraph examinations the same way mock criminals do, or that truly innocent individuals facing imprisonment respond the same way to polygraph examinations as innocent subjects do who may be losing at most the opportunity to make $50. It is hypothesized that the heightened stress of the real-life situation would make the physiological responses to polygraph examinations more pronounced and, therefore, even more accurate. No scientific studies, however, have been presented corroborating this hypothesis. There is no reason to assume that real criminals respond to polygraph examinations the same way as non-criminals. Likewise, there is no reason to assume that truly innocent individuals react the same way to polygraph examinations when facing serious punishment as do those facing no significant consequence.

Laboratory studies are deficient in other respects as well. In laboratory studies, unlike in real life, both the examiner and the subject know that the truth can be ascertained with certainty. The subjects are usually aware that the mock crime has been videotaped or witnessed. How this might affect the physiological responses of the subject has not been tested scientifically in any studies brought to the court’s attention. The mock crimes are not serious crimes resulting in significant damage to property or to persons. The crimes are not violent crimes. The subjects are usually tested within moments after committing the crime, although there was testimony of a study where the test was conducted approximately six weeks after the mock crime was committed. The subjects are often students, which creates a problem of sampling bias. The subjects have no incentive to lie to the examiner about factors that might affect their physiological responses to the questions. These factors include, but are not limited to, the use of medications, drugs, alcohol, the existence of psychiatric disorders, prior exposure to polygraph examinations, training in countermeasures, self hypnosis, or training in meditation. In the court’s view, each of these circumstances must be studied and tested alone and in combination scientifically to determine what effect if any it has on the physiological responses of the subject to the examiner’s questions and on the accuracy of the polygraph examination itself. No evidence of such complete scientific studies has been presented.

Even if all of these problems could be overcome, another serious obstacle remains. Due to the inherent subjectivity of the polygraph examination, it is unclear that studies confirming the purported accuracy of polygraph examinations are confirming the validity of the underlying scientific hypothesis. Each polygraph examination involves not only obtaining physiological responses recorded by the machine, but also the preconditioning of the subject through an extensive pretest as well as careful observation of the subject during the actual test. An experienced polygrapher may well have developed skills observing body language, voice modulation and other visual and oral clues which assist him in determining the truthful subject. It seems impossible to separate the subjective impressions developed by the polygrapher during the pretest and the examination from the physiological responses recorded on the machine and, therefore, to determine whether the underlying scientific hypothesis is in fact valid. Perhaps the most vivid example of the problem comes from the testimony of Johnson himself. Johnson refuses to employ computer polygraph examination scoring programs, because he has found that programs based on differing algorithms have produced opposite results analyzing precisely the same data. In other words, one program examining the readout from the machine finds the subject to be telling the truth, while another program examining the same readout finds the subject to be untruthful. What more compelling proof could there be that the physiological responses of the subject alone, even within the context of the comparison and relevant questions, are insufficient to determine the subject’s truthfulness? The subjective impressions of the examiner appear to be indispensable.

C. KNOWN OR POTENTIAL ERROR RATE

While proponents of polygraph examinations point to high accuracy rates obtained by averaging the results of several studies, the reality is that the error rate of polygraph examinations in real life situations cannot be reliably determined and the error rate of any particular exam is, therefore, unknown. First, as discussed above, the results of field studies and laboratory studies to date are not transferable to real life. Generalizing from any particular study or groups of studies to real life situations is impossible. Consequently, the data upon which experts like Slonka’s expert, Dr. Charles Honts (“Honts”), rely to support their claims of error rates is not helpful in determining error rates in real life situations.

Many variables impact the reliability of any particular exam. These include, but are not limited to, variations in the examiner’s skills; the subjectiveness of the exam; the susceptibility of the subject to pressure or to suggestion during the exam; the ability of the subject to control the results through such things as countermeasures; the setting of the exam; and the adequacy of the pre-test and the design of the comparison questions for the exam. Indeed, Dr. Honts has conceded that a subject who has received only 25 minutes of training regarding nonphysical countermeasures is able to defeat the polygraph approximately 50 percent of the time. These nonphysical countermeasures, furthermore, are virtually impossible to detect. Consequently, there is no way to know whether a particular subject, who has had the time and opportunity to do so, has learned countermeasures which will permit him to defeat the polygraph. Under such circumstances, a polygraph examination would be meaningless. Finally, Dr. Honts has conceded that there is no reliable research regarding the effect of a previous polygraph examination on the subject’s performance during a subsequent polygraph examination. To underscore Dr. Honts’ concern over a subject’s prior exposure to polygraph examination or to information regarding the polygraph process and countermeasures, Dr. Honts and defense counsel requested that the defendant leave the courtroom during substantial portions of Dr. Honts’ testimony as well as that of Johnson.

Perhaps most troubling is the substantial variability in accuracy rates across the field. The defendant’s expert Dr. Honts has identified eight high-quality laboratory studies involving the control question technique. The Polygraph in 1995: Progress in Science and the Law, 71 North Dakota Law Review, 987 (1995). Five of the studies had been conducted before Mendes. Those five studies disclose considerable variation between studies in accuracy rates, even when corrected for inconclusives, as well as considerable variation in the percentage of inconclusives. The three post-Mendes studies show the lowest accuracy rates of any of the studies and likewise demonstrate considerable variation between studies in accuracy rates. No scientific theory or scientifically proven hypothesis was provided to the court to explain these substantial variations. In the absence of evidence establishing the reasons for these variations, such substantial variations among researchers cast doubt on the efficacy of the test and the validity of its underlying theory as well. Commonwealth v. Graham, Berkshire Superior Court, Crim. #96-259 to 263 (Spina, J., Findings, Rulings and Order on Defendant’s Motion to Admit Polygraph Evidence, 7/3/97). Without knowing why the results vary so significantly from study to study, it is impossible to know whether any given test falls within the highest accuracy range reported, or the lowest, or somewhere in between. There is, therefore, no method of verifying the accuracy, or even the probability of accuracy, in any particular case.

D. PEER REVIEW

Both proponents and opponents of the polygraph have published extensively in peer reviewed journals. Articles praising the polygraph and its accuracy may be found in the same journals containing articles describing the polygraph as unreliable and criticizing the research upon which claims of its accuracy are based. This was the situation pre-Mendes and it continues to be the situation post-Mendes.

E. STANDARDS

There are no generally accepted standards for the training or qualifications of polygraphers, or for the administration of polygraph examinations. As Dr. Honts noted in his North Dakota law review article, “[d]uring the last decade, various commentators have derided the quality of the training and competence of the average field polygraph examiner. Honts and Perry were particularly pessimistic that internal forces in the polygraph profession would be able to bring about significant change in this area. Given the poor general state of examiner training in the polygraph profession, the assessment of competence of the specific examiner offering evidence is critically important.” 71 North Dakota Law Review, supra at 998. The pre-test interview is absolutely critical to the accuracy of the polygraph results. The “examiner must introduce the control questions in such a way as to carefully manipulate the subject such that his answers are deceptive or likely to be deceptive. Second, if the examiner does not review the questions in advance, they will come as a surprise and may elicit the same kinds of reactions that would arise if the subject is being deceptive. Under such circumstances it is impossible to distinguish between the two potential classes. Third, if a question is asked for the first time on the test, the subject may have to analyze the meaning of the question in order to formulate an answer. This process of cognitive appraisal can cause substantial reactions which may be indistinguishable from a reaction caused by deception. Fourth, there may be terms of the question that are ambiguous which if not clarified during the pre-test interview may cause a reaction indistinguishable from a reaction caused by deception. It is extremely important that all of these potential problems be eliminated in advance of the test, otherwise the task of determining whether the subject’s reactions are produced by deception or by some other factor is impossible.” United States v. Galbreth, 908 F.Supp. 877, 884-85 (D.N.M. 1995) (cited by Defendant). These and other concerns about the administration of the pre-test demonstrate that proper administration of the pre-test is an indispensable component of the polygraph examination. The absence of generally accepted standards for the training and qualifications of polygraphers and for the administration of control question polygraph examinations is another reason why it is not possible to draw meaningful conclusions from field and laboratory studies, which may employ different protocols for conducting the examination and which may involve examiners with differing training and qualifications, and why meaningful conclusions about error rates of a particular examination yet to be performed may not be reached by averaging the results of previous studies.

F. OTHER FACTORS

While the factors discussed previously are the factors specifically mentioned in Daubert for analysis of the reliability of scientific evidence, courts are permitted to consider additional factors relevant to an assessment of reliability in the particular context of the case at hand. Two such factors are worthy of mention here. First, the fact that polygraph examinations are currently in use outside of the courtroom for different purposes is not conclusive evidence that the polygraph is sufficiently reliable for use in the courtroom. Dr. Honts himself has serious doubts that the polygraph is appropriately used in matters relating to national security. Dr. Honts believes that those posing the greatest security risks (professional spies) are quite likely to have received training in countermeasures which would enable them to fool the examiner. Furthermore, the vast majority of SPR members themselves do not favor admitting into evidence the polygraph examiner’s opinion as to the truthfulness of the subject’s denial of guilt.

Second, polygraph examination appears to lend itself to abuse. Countermeasures can be learned in less than 30 minutes which will permit the subject to fool the examiner approximately 50 percent of the time. Such countermeasures are undetectable. In the real world, there is virtually no way for the examiner to determine whether the subject has received such training or has taken one or more polygraphs in the past. As noted previously, there is no scientific data regarding the effect of prior polygraph examinations on a subject. Dr. Honts admitted that he personally was fooled by a double murderer he found to be truthful, because the individual had trained himself in self hypnosis and was therefore able to beat the polygraph test administered by Dr. Honts.

For all of the above reasons, the court concludes that the defendant has failed to prove that polygraph evidence based upon the control question technique is sufficiently reliable to qualify for admission under Lanigan.

Even if polygraph evidence based upon the control question technique was sufficiently reliable to qualify for admission under Lanigan (which it is not), the particular polygraph examination of Slonka previously conducted by Johnson must be excluded. Both Johnson and Dr. Honts acknowledged the critical role of the pre-test in any polygraph examination based upon the control question technique. Some of the reasons for the pre-test’s pivotal role have been discussed previously in this opinion. Incredibly, however, there is no record of the pre-test administered by Johnson to Slonka. Neither the pre-test, nor the test itself, was videotaped; nor was an audio recording prepared; nor was a written record maintained. Johnson is now unable to testify from memory to the details of that pre-test. Admission of the polygraph examination results under these circumstances would deprive the Commonwealth of any meaningful opportunity to cross-examine Johnson regarding the pre-test or to retain its own expert to evaluate the pre-test and test procedure actually employed by Johnson. Johnson’s failure to require that the pre-test and test be videotaped is particularly troublesome since he had just participated in the Northeastern study to verify his accuracy and one of the essential protocols of that study was the videotaping of the pre-test and polygraph examination. The failure to record Slonka’s examination, therefore, raises serious questions regarding the procedures employed, the legitimacy of the test and the parties’ intentions, or good faith, in conducting the test.

The examination was conducted after Slonka had already been convicted and incarcerated for more than five years. He had been released from incarceration after his appeal had been allowed. The examination was a confidential and privileged exam, meaning that the results if unfavorable would not be disclosed to anyone. The Commonwealth had no advance notice that the exam was to be administered and there was no agreement that the results could be admitted into evidence whatever the outcome. Slonka’s responses to pre-test questions concerning his physical and mental condition, his medications and the like were taken at face value. There was no attempt to determine through investigation whether Slonka had taken previous polygraphs or had received training while incarcerated, or at some other time, regarding countermeasures. These unique circumstances raise additional concerns about the reliability of the examination.

II. STEWART ANALYSIS

In Commonwealth v. Stewart, 422 Mass. 385, 389 (1996), the Supreme Judicial Court stated as dictum that “[i]f polygraph evidence is to be admissible in a given case, it seems likely that its reliability will be established by proof in a given case that a qualified tester who conducted the test had in similar circumstances demonstrated, in a statistically valid number of independently verified and controlled tests, the high level of accuracy of the conclusions that the tester reached in those tests.” See also Commonwealth v. Duguay, 430 Mass. 397 (1999). The evidence in this case is insufficient to satisfy Slonka’s burden in this regard. Johnson is a qualified examiner. His accuracy, however, has not been demonstrated in similar circumstances through independently verified and controlled tests.

The defendant purports to establish Johnson’s accuracy by the Northeastern University Study performed during January into March 1999. Johnson’s exam of Slonka was performed on March 16, 1999. This study suffers from a number of problems. The study was commissioned by CPCS in anticipation of litigation in Commonwealth v. Stephen Henry, then pending in the Middlesex Superior Court. CPCS, intending to employ Johnson to perform the polygraph examination on Stephen Henry, sought to establish his accuracy to comport with requirements of Stewart. A CPCS attorney, Neni Odiaga, was responsible for producing the study for CPCS and participated to some extent in the planning of the study. Johnson was substantially involved in designing the study with Prof. Levin and Janese Free. Funding for the study was obtained by CPCS through the court. The study was never published, or even submitted for publication. The level of CPCS participation, combined with the facts that the study was intended for use in a pending criminal matter and was funded by the court in that matter, that Johnson himself (the person to be tested) helped design the manner in which he would be tested and that the study has never been submitted for publication, leaves the court unpersuaded of the “independence” of the study.

More importantly, however, the study clearly does not test Johnson’s accuracy in “similar circumstances.” All one has to do is compare Slonka’s situation as of March 16, 1999, with that of the subjects in the Northeastern study. Slonka had been convicted by a jury on March 13, 1992, of armed assault with intent to murder and assault and battery by means of a dangerous weapon. He was sentenced to 15 to 20 years in MCI Cedar Junction on the more serious charge. On May 30, 1997, his conviction was reversed by the Appeals Court due to an error in the judge’s reinstruction to the jury on reasonable doubt. He posted bail on Oct. 15, 1997, and has been awaiting retrial ever since. The crime itself was committed on April 28, 1991. It involved the brutal stabbing of a prostitute with what appeared to be hedge shears. The identity of the defendant as the perpetrator of the crime was a key issue in the case. Only the victim identified him as her attacker and there were issues about the strength and reliability of that identification.

By contrast, the subjects in the Northeastern study committed a mock, nonviolent crime, which was witnessed not only by the victim, but also by Janese Free and which was videotaped. After commission of the mock crime, the subject was required to turn in the stolen property and to sign a receipt for it. The subject’s identity as the perpetrator, therefore, was known to a certainty and the perpetrator knew that it was known. The subjects were students who faced no consequences for their conduct other than the possible loss of a $50 bonus. There is nothing to suggest that the students had ever been incarcerated or had ever committed any crime before or had ever been under suspicion of having committed a crime. Within fifteen minutes of having committed the mock crime the students were examined. There was no opportunity between commission of the mock crime and examination to obtain education about the polygraph or on countermeasures successfully to defeat the polygraph. There was no opportunity for the passage of time to affect the subject’s memory in any way or to permit any other psychological factors to alter the subject’s memory of the mock crime. The entire pretest, as well as the test itself, was videotaped.

It is obvious from a mere recitation of the differences between Slonka’s situation and that of the subjects of the study that there is no similarity at all. No persuasive scientific evidence or studies were presented demonstrating that these substantial differences would not have a significant effect on the polygraph examination, either alone or in combination. Since the defendant has not met his burden of proving Johnson’s accuracy in a real life situation similar to the defendant’s, the requirements of Stewart have not been met and the results of the polygraph examination must be excluded.

Slonka’s request for the admission of a future polygraph examination by Johnson or by someone else suffers from many of the same problems. To begin with, it is difficult to conceive of how the accuracy of the examiner could be established in substantially similar circumstances. Any such attempt to qualify the examiner would seem doomed to fail for the same reasons that the Northeastern study has failed. In addition, there would be no meaningful way to ensure that Slonka had not learned mental countermeasures which would render the examination meaningless. He has had ample opportunity to do so and ample incentive to do so. Perhaps most importantly of all, Slonka has already taken a polygraph examination. Both Dr. Honts and Johnson have acknowledged that the significance of a prior polygraph examination has been untested and is unknown. Indeed, both were so concerned about this and about the effect of education regarding the polygraph on any future polygraph examination of Slonka, that they strongly recommended that Slonka leave the courtroom during substantial portions of their testimony, which he did. With so much unknown about the effect of a prior polygraph, the extent of Slonka’s knowledge of polygraphy, Slonka’s training in and knowledge of countermeasures, the passage of time as it might affect Slonka’s memory or his perception of past events, as well as the inability to replicate in any meaningful way polygraph examinations of subjects in Slonka’s real life situation, the reliability of any future polygraph examination of Slonka cannot be established and the results of any such examination, if conducted, would have to be excluded.

III. OTHER CONSIDERATIONS

Even if the polygraph examination results were admissible under Lanigan and even if the requirements of Stewart could be met, the court, exercising its discretion, would exclude the polygraph examination of Slonka previously conducted, as well as any future polygraph examination in this case. The task of assessing the credibility of witnesses is committed to the jurors who are uniquely suited to resolving issues of credibility. Sizing up people and deciding whether to believe them or how much weight to give their statements or testimony is something we all engage in daily. It is readily within the realm of the jurors’ understanding. The need for expert assistance is questionable, particularly where such expert assistance is itself of dubious reliability.

“It is a fundamental principle that ‘a witness cannot be asked to assess the credibility of his testimony or that of other witnesses.’ ” Commonwealth v. Triplett, 398 Mass. 561, 567 (1986), quoting Commonwealth v. Dickinson, 394 Mass. 702, 706 (1985). “Evaluations of credibility are, of course, within the exclusive province of the trier of fact.” Commonwealth v. Ianello, 401 Mass. 197, 202 (1987), quoting Commonwealth v. Bohannon, 376 Mass. 90, 94 (1978). Recently, our appellate courts have emphasized the importance of this principle by the extraordinary care taken to preclude expert witnesses when providing profile evidence or syndrome evidence from expressing opinions which might suggest to the jury that the expert is vouching for the credibility of the victim or that the defendant or the victim fits a particular profile. See e.g. Commonwealth v. Federico, 425 Mass. 844, 848-50 (1997); Commonwealth v. Richardson, 423 Mass. 180, 186 (1996); Commonwealth v. Trubridge, 419 Mass. 750, 759 (1995); Commonwealth v. Goetzendanner, 42 Mass.App.Ct. 637, 644 (1997); Commonwealth v. LaCaprucia, 41 Mass.App.Ct. 496, 497 (1996).

Polygraph evidence violates this fundamental principle. It constitutes opinion testimony by a purported expert that the person tested is truthful, or was at least truthful at the time of the test. The jury is asked to infer from that the defendant was truthful when testifying in court or when denying culpability in an out of court statement. Proponents of the polygraph suggest that such testimony differs from other opinion testimony on the credibility of a witness, because it is an opinion based solely on physical characteristics of the examinee. They attempt to liken it to other physical tests such as fingerprinting, blood tests or DNA testing. Polygraph examinations however are not merely tests of physical characteristics. They are tests that examine physical changes, equivocal in and of themselves, from which the examiner attempts to draw conclusions about the actual thoughts or state of mind of the subject based upon not only the physical changes but also on the structure of the examination, the pre-test and test questions and personal observation of the subject. The physical changes themselves are dependent upon proper preconditioning of the subject’s mind during the pre-test. It is impossible to separate the physical reactions from the interaction between the examiner and the subject and it is impossible to separate the interpretation of the physical reactions from the examiner’s personal opinion of the adequacy of the pre-test, the setting for the test, the questions selected for the test and the conduct of the test itself, as well as the examiner’s subjective impressions of the subject. The testimony of the polygraph examiner therefore is much more like the testimony of other psychiatric experts or other types of experts which is consistently excluded if it may be perceived as vouching for the credibility of the witness and therefore intruding into the exclusive domain of the jury.

Polygraph evidence of the type proposed is also hearsay. The defendant offers his out of court statement for the purpose of permitting an expert to testify that the statement was true, or more precisely that the polygraph results were consistent with a truthful state of mind. If the statement were offered by the defendant directly, it would clearly be inadmissible. “[I]f a defendant is charged with a crime and unequivocally denies it, that denial is not admissible in evidence.” Commonwealth v. Nawn, 394 Mass. 1, 4 (1985). “Unless it is admissible under a different theory, a criminal defendant’s denial of an accusation should be excluded as hearsay.” Commonwealth v. Henry, 37 Mass.App.Ct. 429, 432-33 (1994). No argument is made in this case that the statement is admissible to rebut a claim of recent contrivance. To the extent the defendant’s statement is offered as the basis of the polygraph examiner’s opinion, the statement is likewise inadmissible, as is the opinion based upon it. See DYS v. A Juvenile, 398 Mass. 516, 531 (1986).

It is suggested by the defense that polygraph evidence should be admitted for the limited purpose of corroborating the defendant’s trial testimony. A defendant’s out of court affirmance of his innocence has not previously been admissible in this Commonwealth for purposes of corroborating his trial testimony. See Commonwealth v. Nawn, supra. If a defendant could not himself offer such testimony, the testimony of a proffered expert that such out of court statement displays indicia of truthfulness should likewise be excluded.

The defendant also argues that the repetition of his out of court statement by the polygraph examiner in the course of expressing his opinion regarding the defendant’s consciousness of innocence is not hearsay. The argument is that the words spoken by the defendant out of court are not offered for their truth, but simply for the fact that they were spoken and triggered certain physical responses measured by the polygraph machine from which the polygraph examiner is able to form an opinion that the physical responses are consistent with a consciousness of innocence. There are multiple problems with this argument. First, the defense has failed to cite a single case in this Commonwealth permitting the defendant’s statement of innocence to be introduced as evidence of consciousness of innocence. The court itself has found no such case. If the defendant could not himself testify to his consciousness of innocence, expert testimony that the defendant’s physical reactions when professing his innocence are consistent with consciousness of innocence should also be excluded. Second, as noted earlier, expert opinion regarding the truthfulness of a witness is not admissible in this Commonwealth. Characterizing the polygraph examiner’s testimony as an opinion regarding consciousness of innocence instead of an opinion regarding truthfulness should not change the result. Third, as noted previously, the polygraph examiner’s opinion is not derived solely from the physical changes recorded by the machine. Such changes in and of themselves cannot establish the defendant’s state of mind. Rather, it is the polygraph examiner’s interpretation of those results based upon the pre-test, the questions selected as control questions and the responses to both the control questions and the relevant questions, as well as the polygraph examiner’s opportunity to observe the subject throughout the pre-test and test, upon which the polygraph examiner’s opinion is based. The polygraph examiner therefore is not testifying to an opinion derived solely from the fact that there were certain physical reactions to a particular statement by the defendant. The examiner is testifying to an expert opinion based in part, but not entirely, on such evidence. No compelling reason has been presented to treat such testimony any differently from the testimony of other types of experts. Indeed, in light of the dubious reliability of polygraph examinations, there is compelling reason not to do so.

Finally, introduction of polygraph evidence would divert juror attention from the primary issues of the case to a collateral issue regarding credibility and would divert substantial time and financial resources to the same issue. This case is a perfect example of the time and costs involved. The defense had to file a motion in limine for permission to use polygraph evidence. Both parties had to file extensive memoranda. Three days were consumed in an evidentiary hearing. The defense called two expert witnesses, as well as the polygraph examiner. The Commonwealth chose to rely on cross-examination of these witnesses rather than to call its own experts. Thirty-one exhibits were introduced during the hearing. The court had to absorb all of the testimony of the witnesses, review all of the exhibits and all of the memoranda. It then had to prepare this opinion. If the court were to allow the motion, it is likely that most, if not all, the testimony already heard by the court will be presented to the jury. It is also likely that the Commonwealth will call experts of its own to explain to the jury the considerable problems in relying on polygraph examinations, particularly in this case. The Commonwealth will be required to do so, because polygraph evidence if believed by the jury, is in all likelihood outcome determinative. The Commonwealth, therefore, will be compelled to make an all-out effort to persuade the jury to disregard the polygraph examination results. In the typical case, furthermore, the defense will have to incur the expense of conducting a study to establish the accuracy of the polygraph examiner. In sum, introduction of such evidence poses substantial risk of causing the trial to descend into a time-consuming and very expensive “battle of experts on the probative value of the polygraph test rather than a determination of the guilt or innocence of a defendant.” Commonwealth v. Vitello, 376 Mass. 426, 442 (1978), quoting Commonwealth v. Fatalo, 346 Mass. 266, 269 (1963).

ORDER

For all of the above reasons, Defendant’s motion to admit polygraph evidence is DENIED. 
      
      Numerous exhibits were attached to Slonka’s post trial memorandum. Those exhibits were not introduced into evidence and have been considered, therefore, solely for their citation to case law and law review articles.
     
      
      In his motion, Slonka argued that he had a constitutional right to present polygraph evidence. This argument was not mentioned in his post trial memorandum. It is clear that no such right exists under either the Constitution of the United States or under the Constitution of this Commonwealth. See United States v. Scheffer, 118 S.Ct. 1261 (1998); Commonwealth v. Mendes, 406 Mass. 201 (1989).
     
      
      Since Mendes, two courts in jurisdictions that had not previously admitted polygraph test results as evidence in criminal trials have now done so. In U.S. v. Galbreth, 908 F.Supp. 877, 890-96 (D.N.M. 1995), a tax evasion case, the court, applying Daubert, concluded that polygraph test evidence was sufficiently reliable to be admitted into evidence when the test was administered by a properly qualified and validated examiner. Relying heavily on the Gallup and Amato surveys (see infra), the court found that polygraph testing was generally accepted within the relevant scientific community and that it satisfied the other Daubert factors as well. In U.S. v. Crumby, 895 F.Supp. 1354, 1361-65 (D.Ariz. 1995), a bank robbery case, the court found polygraph testing to be sufficiently reliable under Daubert to be admissible when (1) the prosecutor received sufficient notice and was given a reasonable opportunity to administer its own test; (2) the evidence was admitted solely to impeach or corroborate the defendant’s trial testimony as evidence of the defendant’s character for truthfulness under Fed.R.Evid. 608(a) and (3) only the fact that the test was given and passed was admitted, not the specific questions asked during the test or the answers thereto or the specific physiological responses or data. Surprisingly, the court concluded that, after Daubert, general acceptance within the relevant scientific community was not a particularly significant factor, but that the defendant, nevertheless, had “shown that polygraphy is accepted by polygraphers and its use is endorsed by a number of organizations.” Id. at p. 1360. The court cites only Amato’s 1993 survey to support this conclusion. Other courts, like this Commonwealth in Stewart, have abandoned a per se rule of exclusion since Daubert, but did not go on to determine the admissibility of polygraph evidence. U.S. v. Cordoba, 104 F.3d 225, 227-29 (9th Cir. 1997); United States v. Redschlag, 971 F.Supp. 1371, 1374 (D.Colo. 1997). Some courts have admitted evidence of the fact and circumstances of the polygraph examination solely to support or rebut a defendant’s claim that his confession was coerced or was otherwise unreliable. People v. Jefferson, 705 N.E.2d 56, 59-62 (Ill. 1998); People v. Mellock, 599 N.E.2d 941, 960 (Ill. 1992); People v. Jackson, 556 N.E.2d 619, 628-29 (Ill. 1990); and State v. Schaeffer, 457 N.W.2d 194, 195 (Minn. 1990). Finally, at least one state has moved to admitting polygraph test results where all parties have stipulated to its admissibility. State v. Satter, 543 N.W.2d 249, 255 (S.D. 1996).
     
      
      At the hearing, Dr. Honts identified a fourth post-Mendes study which he neglected to include in his article. His failure to do so is surprising, since this study supposedly had the highest accuracy rate of all. Although such a study would raise the average accuracy rate, it does little to explain the substantial variations from study to study.
     
      
      Attorney Odiaga also assisted Slonka's attorney through the hearing in the instant case.
     
      
      There are other concerns about the Northeastern study. Johnson said that his rate of inconclusives over the last ten years has been about five percent. This year he has had one inconclusive in eighty examinations. Yet he had three inconclusives out of nineteen valid tests in the Northeastern study. No explanation was offered for this. Janese Free attended some of the examinations. She knew whether the subject had committed the mock crime. This raises the possibility that she may unintentionally have provided Johnson clues to the subject’s veracity. Free also knew what Johnson’s findings were with respect to each subject as the study progressed. Dr. Honts admitted it would be preferable to have kept those results secret until all examinations had been conducted.
     
      
      Johnson, Levin, Free and Dr. Honts also testified in a preliminary hearing in Commonwealth v. Henry, on the admissibility of polygraph evidence. Judge Zobel excluded all such evidence from the trial.
     