
    Joel L. GIER, By and Through His Parents and Next Friends, Jack L. Gier and Mary E. Gier; Amanda Petska, By and Through Her Parents and Next Friends Roger D. Petska and Kathleen R. Petska; Lisa L. Kummer, By and Through Her Parent and Next Friend, Theresa M. Kummer; Heather A. Duncan, By and Through Her Parents and Next Friends, James P.L. Duncan and Kathryn J. Duncan; Thomas W. Lute, By and Through His Parents and Next Friends, Lewis O. Lute and Carol V. Lute; Kelly V. Christiansen, By and Through Her Parent and Next Friend, Virginia L. Christiansen; and Harold Lutkehus II, By and Through His Parent and Next Friend, Connie K. Lutkehus, Plaintiffs, v. EDUCATIONAL SERVICE UNIT NO. 16; Marge Lehman and Marge Beatty, Defendants.
    No. 7:CV92-5000.
    United States District Court, D. Nebraska.
    Feb. 2, 1994.
    James R. Welsh, Bradford, Coenen Law Firm, C. Gregg Larson, Omaha, NE, for plaintiffs.
    Brian D. Nolan, Stephen P. Ingram, Hansen, Engles Law Firm, Omaha, NE, William T. Wright, Jacobsen, Orr Law Firm, Kearney, NE, for defendants.
    James L. Hatheway, Nebraska Dept. of Social Services, Lincoln, NE, for Department of Social Services, State of Neb., interested party.
   MEMORANDUM AND ORDER

PIESTER, United States Magistrate Judge.

Pending before the court is defendants’ motion in limine (filing 282) to exclude the testimony of certain expert witnesses pursuant to Fed.R.Evid. 702. See Daubert v. Merrell Dow, — U.S. —, 113 S.Ct. 2786, 125 L.Ed.2d 469 (1993). A hearing was held on the motion on Monday, January 31, 1994. See United States v. Martinez, 3 F.3d 1191 (8th Cir.1993) (directing district courts to hold such hearings on Daubert issues). For reasons discussed more fully below, I shall grant the motion, in part.

Defendants seek to exclude certain testimony of a psychiatrist, Dr. Scanlan, and two psychologists, Drs. Sullivan and Jones, who examined the plaintiffs, seven mentally retarded individuals. As a result of the examinations, the witnesses determined plaintiffs had been sexually, physically and emotionally abused. Each of the plaintiffs attended the school of the state-established Educational Service Unit No. 16 (ESU 16). Plaintiffs intend to offer the testimony of these witnesses as experts in the detection of such abuses; the testimony will be introduced as substantive evidence that the plaintiffs were physically, sexually and emotionally abused and that the abuse took place at ESU 16. Further, the plaintiffs intend to introduce the opinions of the witnesses that the plaintiffs suffered permanent psychological injuries because of the abuse.

DAUBERT INQUIRY

In Daubert, the Supreme Court set forth a two-part analysis to be utilized by the trial court when considering the admissibility of expert testimony. The focus of the court’s inquiry must remain on whether the proffered testimony is both relevant and reliable. Daubert, — U.S. at —, 113 S.Ct. at 2795. The Court stated that admissible expert testimony must be based on more than “subjective belief or unsupported speculation.” Id. In the case at bar, the testimony of the three experts will be inadmissible unless it is “derived by the scientific method” and “supported by the appropriate validation.” Id. In satisfying the reliability portion of the test, plaintiffs must establish by a preponderance of the proof that “the reasoning or methodology underlying the testimony is scientifically valid and [] that reasoning or methodology properly can be applied to the facts in issue.” Id., — U.S. at — & n. 10, 113 S.Ct. at 2796 & n. 10.

To assist in the reliability analysis the Court set forth a non-exhaustive list of issues to be considered. As applied to the particular case, a “key question” is precisely what was the methodology or technique employed by each expert witness for each plaintiff and whether that methodology or technique is capable of being falsified, that is, whether it actually employs the “scientific method.” Id., — U.S. at —, 113 S.Ct. at 2796-97. Further, a consideration in this instance will be whether the particular methodology or technique employed by the experts was subjected to peer review; its potential or known rate of error; and whether it has attained “general acceptance” within the scientific community. Id., — U.S. at —, 113 S.Ct. at 2797. Of course, as noted by the Court, the inquiry regarding the reliability is to be based “solely on principles and methodology, not on the conclusions that they generate.” Id.

Numerous cases and commentary have considered the admissibility of psychological testimony that a particular child was abused, whether it be physically, sexually or emotionally. See generally, Myers, et al., Expert Testimony in Child Sexual Abuse Litigation, 68 Neb.L.Rev. 1 (1989) (Myers); McCord, Expert Psychological Testimony About Child Complaints in Sexual Abuse Prosecutions: A Foray into the Admissibility of Novel Psychological Evidence, 77 J.Crim.L. & Criminology 1 (1986). The psychological determination is “vitally important.”

Determining whether a child has been sexually abused is a matter of great importance. If this judgment is wrong, a child’s physical and emotional health may be permanently jeopardized, additional children needlessly abused and their families and communities traumatized. Just as important, an individual’s reputation, access to and custody of children, and even liberty, may be lost over false accusation. Children’s recovery from the effects of child abuse, the protection of the community and the protection of innocent persons depends on accurate decision making.

Berliner, Deciding Whether a Child has been Sexually Abused, in Sexual Abuse Allegations in Custody and Visitation Cases 48 (B. Nicholson & J. Bulkley eds. 1988), quoted in Myers, supra, at 71.

The difficult task of considering the admissibility of the particular evidence is made more difficult in this case, however, by a number of factors. First, a typical case will contain the evidentiary issue with one child, one expert and one type of abuse. Here, a decision must be made for three experts, seven children and three separate types of abuse. Further, few decisions on the issue have been decided after the Supreme Court in Daubert set forth the proper analysis to be followed under Rule 702. See State v. Foret, 628 So.2d 1116 (La.1993); State v. Cressey, 137 N.H. 402, 628 A.2d 696 (1993) (both applying state versions of Fed.R.Evid. 702.) Finally, the plaintiffs in this case are all mentally retarded with extremely limited communications skills, conditions not explicitly discussed by most cases and commentators.

PLAINTIFFS’ EVALUATIONS

To fully analyze the reliability of the techniques of the witnesses in this case, it is necessary to describe those techniques and the resulting bases for the witnesses’ conclusions about the purported abuse suffered by the plaintiffs. Five of the plaintiffs were examined at the Boys Town National Institute (Boys Town). Sullivan and Scanlan are part of the team who performed the evaluations. The protocol followed by the evaluation team was developed by Sullivan. That protocol is essentially a three-step process.

First, the child’s history is taken from the parents and possibly from social services. (Scanlan depo. at 32). The second step is to take a psychological evaluation. That evaluation is made through standardized tests; the most-used test for the plaintiffs was the Child Behavior Checklist (CBC). The child’s parents fill out the CBC based on the behaviors they had observed. (Id. at 37:16-39:1). The final step is to conduct a clinical interview. This interview is performed by counselors at a master’s degree level of education. Through their affidavits, Sullivan and Scanlan indicated that a protocol has been developed for these interviews. The protocol allows the interviewer to employ anatomically-correct dolls and art or graphic materials to gain information from the child. Scanlan stated that the interviewer has freedom to determine how best to conduct an interview for a particular child. (Id. at 41:21-42:13).

Doctor Jones works as a psychologist in private practice in Colorado. She examined plaintiffs Lute, Duncan and Petska. She became aware that other children from ESU 16 were being examined for abuse by Boys Town after reaching her conclusions, and none of her conclusions was affected by the Boys Town conclusions. In an affidavit she stated that she employs the protocol designed by Sullivan. However, in deposition testimony, she stated that she reached the conclusions that Lute was abused after providing psychotherapy rather than an actual evaluation and assessment. She stated that psychotherapy includes some informal assessment. She has not explained precisely what protocol she followed to make such an assessment through psychotherapy.

Because of the number of plaintiffs and the number of witnesses involved, it would cause this memorandum to become unduly lengthy to describe in detail the evaluations conducted for each plaintiff and the bases for the witnesses’ opinions. Although I have examined the procedures used for each evaluation and the bases for each opinion, I shall discuss only some of the evaluations. Particularly, I shall describe the Boys Town evaluation of plaintiff Gier, and Jones’ evaluations of plaintiff Lute.

1. Joel Gier

Plaintiff Gier went to Boys Town in 1987 for a regular, triennial examination to determine his overall functioning level. Sullivan performed the evaluation. At that time he was seven years old, but functioned at the level of a four-year old. At the end of the session, Joel stood from his chair and Sullivan noticed bruises around each arm above the elbows. She asked Joel’s parents about the bruises. His parents indicated that Joel had returned from school with the bruises. As a result of the conversation, Sullivan determined Joel should be scheduled for an evaluation to determine whether he had been abused.

As part of the evaluation, his parents filled out a CBC which indicated that Joel was engaging in some sexualized behavior. During depositions Sullivan was unable to locate that CBC in the Boys Town medical file. Two clinical interviews were conducted with Joel; Maureen Mann, a non-Ph.D. counselor at Boys Town conducted the interviews. Sullivan viewed portions of the interviews from behind a one-way mirror. The summaries in the medical file describing Joel’s behavior were made by Mann.

Based on the clinical interviews, the psychological evaluation, her observation of the bruises on Joel’s arms and her discussions with Joel’s parents, Sullivan reached the opinion that Joel had been physically, sexually and emotionally abused at ESU 16. The bases for the opinion about physical abuse were the bruising, the parents’ statements that Joel had come home from school with the bruises, Sullivan’s belief that the parents were telling the truth and were not abusing Joel, the mother’s statement that Joel was afraid of school, and Joel’s behavior during clinical interviews role playing in which he ran to a corner of the room and balled up in the fetal position when a particular doll was labelled as a teacher. Also, during the interviews, Joel used the “teacher” doll to hit and restrain children dolls. From this information Sullivan reached the conclusion that Joel had been physically abused by the particular teacher which had been depicted by the doll.

Sullivan also reached the conclusion that Joel had been sexually abused at ESU 16, although she was unable to conclude what the abuse consisted of or whether it was a staff member or older student who performed the act constituting abuse. The opinion was based on what she described as demonstrations of “sexualized behavior.” She stated that no specific mention was made in the Boys Town files of particular actions observed by any staff members. Joel’s mother did report that Joel was “humping” stuffed animals and siblings. Sullivan also concluded that Joel had been emotionally abused at ESU 16. This conclusion was based on the physical and sexual abuse which she concluded had taken place.

2. Tommy Lute

Jones first saw Tommy Lute on May 12, 1988. Jones first met with Tommy’s mother who stated that he was having nightmares which were causing problems at his school. At that time, Tommy was not attending ESU 16; his family had moved to Colorado where he was “mainstreamed” in a high school. Jones stated that she did not formally evaluate Lute at that time; rather, she provided psychotherapy for him. In the process of the psychotherapy, which lasted over a considerable amount of time, she came to conclusions about abuse. Other than an affidavit which appears to be inconsistent with her deposition testimony, she does not state explicitly how she undertook the sessions. Her interviews were not recorded, but some verbatim comments were provided to counsel; those comments quote Tommy communicating largely in one- or two-word phrases.

Jones based her conclusions that Tommy was sexually abused at ESU 16 on the following: 1) during the first session, Tommy removed the pants from a male doll and touched its penis, stating that some teacher whose name Jones could not recall with certainty had touched Tommy’s penis; 2) during the therapy, Tommy knew the meaning of “french-kiss”; 3) Tommy’s statements that (a) “Sharon” had touched his genitals; (b) Mr. Easton had shown a movie with a “sexual theme”; and (c) “Sharon” “had sex” with Tommy numerous times; 4) Tommy’s general anxiety; 6) his separation anxiety from his mother; 7) increased interest in his own body than should a child of his age and developmental level; 8) anxiety, stuttering and stammering when talking about sex and ESU 16; 9) his mother’s reports of poor functioning while at ESU 16; and 10) signs of post-traumatic stress disorder, including nightmares and the tendency for certain stimuli (including the mention of devils or demons or the sight of Mr. Easton) to cause Tommy to become aggressive or angry.

She based her conclusions that Tommy was physically abused on 1) Tommy’s reports of being struck by certain teachers; 2) his general fear when others approach, fearing they might strike him; and 3) his angry handling of dolls. Regarding the emotional abuse, Jones based her conclusions on Tommy’s: 1) anxiety, stuttering and general fearfulness; 2) verbalizations about what he did not like at ESU 16; 3) difficulties when his current school reminded him of ESU 16; 4) verbalizations that he was afraid; and 5) the presence, in her opinion, of physical and sexual abuse.

GENERAL QUESTIONS OF ADMISSIBILITY

As stated earlier, numerous courts and commentators have expressed concerns about the reliability of the type of expert testimony proffered by plaintiffs in this case. The underlying suspicions of such testimony are succinctly captured by the New Hampshire Supreme Court: “Generally speaking, the psychological evaluation of a child suspected of being sexually abused is, at best, an inexact science.” State v. Cressey, 628 A.2d at 699. The conclusions of Drs. Scanlan, Sullivan and Jones regarding the emotional and physical abuse of the plaintiffs are based on procedures identical to those employed for the conclusions that the children were sexually abused. In that respect, the evaluations for sexual, physical and emotional abuse are inseparable. The suspicions expressed regarding evaluations for sexual abuse are equally applicable to the evaluations for physical and emotional abuse.

Few of the witnesses’ bases for their opinions that plaintiffs were abused stem from particular allegations of the plaintiffs that an individual at ESU 16 performed a particular action. A conclusion that a child was abused may not be based solely on the allegations of the child. United States v. Whitted, 11 F.3d 782, 786 (8th Cir.1993). Accord State v. Cressey, 628 A.2d at 700 (to be admissible the conclusions must be “at least partly based on factors in addition to and independent of the victim’s accounts; otherwise, the expert’s conclusions are of no value to the jury because they present no new evidence and are merely vouching for the credibility of the child victim witness.”) See also, United States v. Azure, 801 F.2d 336 (8th Cir.1986) (expert witness may not vouch for credibility of child victim).

In the quest for independent confirmation of suspicions of abuse, psychiatrists and psychologists generally, and in this case particularly, seek to compare the behavior or “symptoms” of the child to that of victims of child abuse. From the standpoint of evidentiary reliability, that quest is fraught with danger. I share the concern expressed in Cressey that

the evaluations of the children deal almost exclusively in vague psychological profiles and symptoms, and unquantifiable evaluation results. There is much criticism attacking the attempts to compile a list of symptoms and behaviors to serve as an accurate indicator of whether a child has been sexually abused. See Commonwealth v. Dunkle, 529 Pa. 168, 602 A.2d 830, 832-36 (1992) (citing articles); State v. J.Q., 252 N.J.Super. 11, 33-35, 599 A.2d 172, 184-85 (1991) (citing articles), aff'd, 130 N.J. 554, 617 A.2d 1196 (1993); State v. Rimmasch, 775 P.2d 388, 401-02 (Utah 1989) (citing articles)[.]

State v. Cressey, 628 A.2d at 700; see also, Myers, 68 Neb.L.Rev. at 67-68. Even in terms of non-retarded children, “The consensus among scholars is that there are as yet no scientifically reliable indicators of child sexual abuse.” State v. J.Q., 252 N.J.Super. at 33, 599 A.2d at 184, quoted in State v. Cressey, 628 A.2d at 700.

A further “inherent” difficulty in the use of this type of psychological evidence is that it “is essentially ‘irrefutable’, as the only way to test it is by proposing theoretical explanations for behavior and then testing the theories on patients.” State v. Foret, 628 So.2d at 1125, citing Morse, Failed Explanations and Criminal Responsibility: Experts and the Unconscious, 68 Va.L.R. 971, 995 (1982). The difficulty in refuting the evidence is borne out by the resulting inability to cross-examine the witness’ opinion. As in most cases, the witnesses’ opinions are based on an interpretation of all of the factors before them as opposed to a single indicator or symptom. Again, as expressed by Cressey,

So even though the defendants may be able to discredit several of the indicators, symptoms or test results, the expert’s overall opinion is likely to emerge unscathed. An expert using this methodology may candidly acknowledge any inconsistencies or potential shortcomings in the individual pieces of evidence she presents, but can easily dismiss the critique by saying that her evaluation relies on no one symptom or indicator and that her conclusions still hold true in light of all the other available factors and her expertise in the field. In such a case, the expert’s conclusions are as impenetrable as they are unverifiable.

State v. Cressey, 628 A.2d at 701.

One of the primary difficulties facing a court considering the admissibility of such evidence is described by the Pennsylvania Supreme Court in Commonwealth v. Dunkle, 529 Pa. 168, 602 A.2d 830 (1992).

The principal flaw with the notion of a specific syndrome is that no evidence indicates that it can discriminate between sexually abused children and those who have experienced other trauma.... In order for a syndrome to have discriminant ability, not only must it appear regularly in a group of children with a certain experience, but it also must not appear in other groups of children who have not had that experience.
According to the literature on the subject, there is no one classical or typical personality profile for abused children. The difficulty with identifying a set of behaviors exhibited by abused children is that abused children react in a myriad of ways that may not only be dissimilar from other sexually abused children, but may be the very same behaviors as children exhibit who are not abused.

Id., at 174, 602 A.2d at 832 [footnotes omitted]. After reviewing the relevant literature on the subject, the court, applying the Frye test, stated, “it is clear that the testimony about the uniformity of behaviors exhibited by sexually abused children is not ‘sufficiently established to have gained general acceptance in the particular field in which it belongs.’ ” Id., 529 Pa. at 176-77, 602 A.2d at 834, quoting Commonwealth v. Nazarovitch, 496 Pa. 97, 101, 436 A.2d 170, 172 (1981); see also, State v. Rimmasch, 775 P.2d 388, 401 (Utah 1989) (finding the literature disparate and contradictory and noting an inability of experts to agree upon a universal symptomology for sexual abuse).

While the general nature of psychological testimony in abuse cases casts suspicion on its admissibility after Daubert, my decision in this case does not cast so broad a net as to encompass all such testimony. Rather, my concerns about the specific testimony proffered in this particular case lead to the conclusion that plaintiffs have not demonstrated by a preponderance of the evidence, Daubert, — U.S. at — & n. 10, 113 S.Ct. at 2796 & n. 10, that the techniques employed in determining that the plaintiffs were abused are sufficiently reliable to satisfy Rule 702. Although the general concerns regarding this type of testimony remain important, plaintiffs’ failure to demonstrate the reliability of the particular methodologies of the doctors when used to determine whether these mentally retarded individuals were abused is also central to my decision.

RELIABILITY OF EXPERTS’ METHODOLOGY

During the hearing on the Daubert issues, plaintiffs submitted affidavits from Sullivan, Scanlan and Jones. The affidavits are essentially identical. Generally, they assert that the evaluations of each child were conducted according to the protocol described in exhibits 16 and 38 of Sullivan’s deposition testimony. They assert this protocol has been subjected to peer review for use with non-retarded children. They allege that,

the professional literature (see, Perlman, N & Ericson, K (1992)) has recognized that the same basis strategies can be used successfully with mentally retarded children. No empirical methods of testing for abuse exist within the scientific community.

(See Scanlan aff. at 4). Further, the affidavits concede that psychological conclusions are not susceptible to “proof” in a sense that proof is possible with the physical sciences.

At the hearing defendants submitted the affidavit of Dr. Jack Stark, a psychologist and director of medical psychology with the Physicians Clinic of Omaha. Stark’s criticism of the plaintiffs’ proffered evidence is that the methodologies employed fail to account for the plaintiffs’ mental retardation. He correctly indicated that none of the plaintiffs’ witnesses professes to have any expertise in the field of mental retardation. He asserts that without sufficient training and experience in the field of mental retardation, a professional may misunderstand the behavior and misinterpret the actions of a mentally retarded child. Such incorrect observations and interpretations will likely lead to unduly speculative conclusions, he states.

Stark criticized the methods employed by Jones, Sullivan and Scanlan for failing to account for the plaintiffs’ mental retardation. He states that the plaintiffs’ witnesses made an incorrect extrapolation by comparing the behavior of mentally retarded children to the model of abused non-retarded children. He describes as a “fallacy” the generalization of behaviors of retarded and non-retarded children. Stark further criticized the methodology utilized by plaintiffs’ witnesses, stating that when used for retarded children it had never been subjected to peer review. He stated that the CBC had never been validated for use on mentally retarded persons, and he argued that the use of anatomical dolls was improper. Stark concluded by indicating that certain standards to be used in the evaluation of mentally retarded persons had been established by professional associations and were in use throughout the country but were not utilized by plaintiffs’ witnesses in their evaluations of plaintiffs. Those standards included, inter alia the use of persons trained and experienced in mental retardation, the use of test instruments and interview techniques validated on mentally retarded individuals.

I am troubled by plaintiffs’ essentially unsubstantiated conclusion that research and data about non-retarded children is equally applicable to retarded children. Logically, differences could be apparent in the behavior of retarded and non-retarded children. While defendants have alluded to such differences in the deposition examinations of the witnesses and asserted that some instruments employed by Boys Town have not been validated for use with retarded children, they have not gone so far as to present evidence as to what the differences are and how they so infect the research on non-retarded children as to make it completely unreliable in this action. I cannot conclude the retarded/non-retarded dichotomy alone renders the witnesses’ testimony inadmissible; however, it has relevance when considering the clinical techniques and conclusions of plaintiffs’ witnesses.

1. Psychological Evaluation

The witnesses’ conclusions were based, at least in part, on the use of Child Behavior Checklists (CBC). While this test is concededly standardized and accepted as reliable for the use of non-retarded children, defendants assert that use of the CBC for retarded children is unreliable. Defendants presented evidence that use of the CBC has not been validated for use with mentally retarded children. Plaintiffs presented no evidence to the contrary, and in fact Dr. Sullivan acknowledged in her deposition that the CBC has not been validated in tests of retarded children, even though her affidavit appears to assert otherwise.

Further, as a more general objection, the CBC is insufficient, on its own, to demonstrate that a child has been abused. Myers, supra at 59-60. The CBC contains too few questions about sexual behavior to “permit an adequate description of the range of possibly affected behaviors.” Id. at 60. Myers stated that an instrument has recently been developed which focuses on sexual behavior; that instrument is the Child Sexual Behavior Inventory (CSBI). Id. at 60 & n. 250. It appears that the CSBI, modeled after the CBC, was introduced in 1988. Nothing in the record suggests that the CSBI was used in the evaluation of any of the plaintiffs. Also troubling about the use of the CBC is that for many plaintiffs, the relevant CBC’s were not in the medical files kept by Boys Town. Defendants were not allowed to examine the CBC’s because they could not be found. Without reference to the results of the tests the defendants and the court cannot examine the efficacy of conclusions based on their use.

2. Clinical Interview

I am most troubled by the methodologies utilized in the clinical interviews of the plaintiffs. Plaintiffs’ witnesses assert that they have established a methodology accepted by the relevant scientific community and that the methodology has been subjected to peer review. I am unpersuaded by plaintiffs’ argument. First, as noted above, the protocol submitted by plaintiffs was not the protocol, in its entirety, used in the plaintiffs’ clinical interviews. Second, the protocol as written leaves a gaping hole in the direction it provides the master’s level clinician to conduct the interview. As written, the protocol allows for the use of anatomically-correct dolls as well as art or graphic materials to assist in the diagnosis of the child but does not provide specific direction for their use. Nothing in the record demonstrates that the clinician’s specific training is in line with any standardized, validated protocol or methodology. Evidence in the record suggests that their training fails to rest on such a principled basis. For instance, Scanlan stated that Boys Town employs no standardized procedures for the use of dolls in role playing. (Scanlan 256:9-256:19). The interviewer’s use of dolls is “highly dependent on the situation and the child.” (Id. at 256:24-25). The court has absolutely no way of evaluating the actual methodologies utilized by either the Boys Town evaluation teams or by Jones because no verbatim records of the interviews exist. No video or audio tapes were ever made of the interviews. Boys Town retains no contemporaneous notes of the interviews. Without verbatim records, it is impossible to determine, for instance, whether the interviewer inadvertently suggested an answer or incorrectly interpreted a response from one of the plaintiffs, especially given the plaintiffs’ retardation and language difficulties.

RELIABILITY OF THE EXPERTS’ TESTIMONY

In this action, I conclude plaintiffs have failed to demonstrate by a preponderance of the evidence that the witnesses’ opinions about the plaintiffs’ purported abuse are reliable. As with any case such as this, the psychological techniques employed are essentially “untestable” because their “very nature as an opinion as to the causes of human behavior, and the fact that the methods for testing the results ... are rife with potential for inaccuracy. Thus, the ‘key question’ of testability cannot be conclusively answered.” State v. Foret, 628 So.2d at 1125.

Plaintiffs assert the error rate using their methodology is very low. I cannot conclude the evidence bears out such a conclusion. Attached to the affidavit submitted by Dr. Sullivan is an article, Berliner & Conte, Sexual Abuse Evaluations: Conceptual and Empirical Obstacles, 17 Child Abuse & Neglect 111, which discusses interview techniques and the difficulty in determining the reliability or error rate of abuse determinations. The authors contend that attempts to make determinations of the reliability of studies are problematic because there is no foolproof way to determine whether a child was actually abused. Interestingly, the article Dr. Sullivan relies upon to bolster the propriety of her interview techniques states,

Classification of a case in research, a case report, or in clinical practice is fundamentally a professional judgment or an opinion. In most cases there is no mechanism for establishing the absolute truth, whereby it is known for sure whether or not the child was actually abused.... Given the lack of an independent mechanism for knowing whether the child has been abused or not, and therefore, the reliance on opinion for classification, the validity of the judgment often cannot be absolutely established.... Nor has the reliability of the judgments been fully established in published reports. Without estimates for validity and reliability of those opinions, they should be approached with some caution since fundamental qualities of the judgments (e.g., accuracy, stability over time) are not known.

Id. at 113.

Discussing the known or potential rate of error in this type of case, the Louisiana Supreme Court stated,

One of the few sources for validation of expert determinations of the existence of child sexual abuse is found in a study conducted that examined over 100 cases of child sexual abuse where the perpetrators confessed or acknowledged the abuse. Faller, “Criteria for Judging the Credibility of Children’s Statements About Their Sexual Abuse”, 67 Child Welfare 389 (1988), cited in Myers, supra, at 75-76. Faller described three factors that were crucial in determinations of abuse. They were “information about the context of the sexual abuse, the description or demonstration of the sexual victimization, and the victim’s emotional state.” Faller at 391.

State v. Foret, 628 So.2d at 1126. Those “crucial” factors are simply not present in this case. In each instance of purported abuse, the plaintiffs could not or did not provide any information about the “context” of the abuse or provide any description or demonstrate the alleged victimization. Plaintiffs almost never were able to state that any person had performed any specific act which constituted, in the witnesses’ minds, abuse. It was in large part for this reason that nearly each time one of the doctors was asked in deposition whether they had an opinion as to the identity of the perpetrator of the abuse, the place at ESU 16 where the act constituting abuse took place or the context in which the act took place, the doctor said “no.”

Plaintiffs have failed to demonstrate by a preponderance of the evidence that their experts’ methodologies for evaluating the plaintiffs in this particular case are reliable for the investigative purposes for which plaintiffs now seek to use them. The witnesses all testified that their purposes in evaluating plaintiffs were for the provision of therapy, not investigation. The methods used here may well have been sufficiently reliable for purposes of choosing a course of psychotherapy for these disturbed children, a course which must, to some extent, rely upon perception as well as reality, and upon the subjective reports of parents and others. However, the methodologies have not been shown to be reliable enough to provide a sound basis for investigative conclusions and confident legal decision-making. See Daubert, — U.S. at —, 113 S.Ct. at 2796 (“ ‘Fit’ is not always obvious, and scientific validity for one purpose is not necessarily scientific validity for other, unrelated purposes.”).

It may be that current standardized methodologies, if shown to have been adhered to rigorously, might satisfy the Daubert requirements and meet the standards of Rule 702, but the plaintiffs have failed to show that such standards were met in this case. Given the plaintiffs’ retardation and communicative difficulties, and the lack of evidence demonstrating the precise methods used and their reliability, I cannot conclude that the witnesses’ opinions regarding abuse are sufficiently reliable to be admissible under the rules.

Based on the Eighth Circuit’s decision in United States v. Whitted, supra, and the evidence presented by the parties regarding the Daubert hearing, I have made the following conclusions regarding the proffered testimony of Scanlan, Sullivan and Jones. These witnesses may not testify that any plaintiff was abused in any manner, nor may they testify to any opinion based on that conclusion. See Whitted, 11 F.3d at 787. The witnesses may testify to the characteristics of non-retarded abused children; they may testify to the behaviors plaintiffs exhibited which are consistent with those behaviors, limited to those which the witnesses personally observed. See Id. Finally, the witnesses may not testify as to any opinion that plaintiffs’ behavior is consistent with abuse of any kind. While Whitted generally allows such testimony, plaintiffs have failed to demonstrate the reliability of the bases for such opinion testimony, as required by Daubert.

INTERLOCUTORY APPEAL

Plaintiffs have indicated that the testimony of these witnesses is essential to their case. They have indicated that in the event I make a ruling such as found in the memorandum above they wish to make an interlocutory appeal to the Eighth Circuit. See 28 U.S.C. § 1292(b). I agree that the issues involved in this matter are central to the case, and that their resolution by the Court of Appeals may well ultimately hasten the final resolution of this dispute. Because of the short amount of time until the trial date, I shall allow plaintiffs 24 hours from the entry of this order to indicate, via telephone, whether they wish to appeal this order. If plaintiffs do wish to appeal the order, I shall stay the action until the Eighth Circuit has disposed of the appeal.

IT THEREFORE HEREBY IS ORDERED that defendants’ motion in limine (filing 282) is granted in part and the testimony of witnesses Scanlan, Sullivan and Jones is limited as follows:

1. The witnesses shall not testify to any conclusion that any plaintiff was abused in any way nor shall they testify as to any opinion based on such a conclusion.

2. The witnesses may testify to the behavior characteristics of non-retarded abused children; they may testify to the behaviors plaintiffs exhibited which are consistent with those behavior characteristics, subject to the limitations of the above memorandum.

3. The witnesses shall not testify as to any opinion that plaintiffs’ behavior is consistent with abuse of any kind.

4. If they wish to appeal, plaintiffs are directed to contact the undersigned, via telephone, of their intention within 24 hours of the entry of this order, whereupon I shall stay the action pending disposition of the appeal.

5. The clerk shall notify counsel immediately by telephone and facsimile copy, of the entry of this order. 
      
      . The parties have consented to have this action proceed before me pursuant to 28 U.S.C. § 636(c). (Filing 153).
     
      
      . Although plaintiffs have not submitted a designation of expert testimony, their answers to interrogatories indicate that the experts also intend to testify that the school failed in its mission to educate the plaintiffs; that the plaintiffs’ earning capacity has been diminished by the abuse; and that plaintiffs’ families need psychotherapy as a result of the injuries.
     
      
      . Daubert held that the adoption of the Federal Rules of Evidence, particularly Rule 702, superseded the “general acceptance” test announced in Frye v. United States, 293 F. 1013 (D.C.Cir. 1923), described as the “dominant standard for determining the admissibility of novel scientific evidence at trial.” Daubert, — U.S. at —, 113 S.Ct. at 2792.
     
      
      . The parties have cited, and the court’s research has discovered, no federal cases applying the Daubert standard to this particular type of evidence.
     
      
      . The mental age of the plaintiffs could have some bearing on the witnesses’ ability to accurately make a determination of abuse. Several commentators concluded,
      [M]ost cases can be validated by investigative interviewing and by assessing the credibility of the history of sexual abuse elicited from the child. In our experience this can be done with children in the age range of five years and older.... [A child aged three to five] frequently lacks the verbal and conceptual skills required for investigative interviewing to have validity.
      Sgroi, Porter & Blick, Validation of Child Sexual Abuse, in Handbook of Clinical Intervention in Child Sexual Abuse (S. Sgroi ed. 1982) quoted in Myers, supra, at 75. One concern in the present case remains the lack of communication and conceptual skills possessed by the plaintiffs, especially given their mental retardation.
     
      
      . In the affidavit she indicates that she conducted the evaluation in accordance with the protocol developed and utilized by Dr. Sullivan. As will be further discussed later, see footnote 15, the affidavit is inconsistent with the deposition testimony. First, Sullivan’s protocols were not in written form at the time Jones began Lute’s psychotherapy. Second, the protocol calls for video taping; Jones did not video tape her sessions with Lute. Third, Sullivan's protocols are designed strictly for evaluations, not therapy. Jones stated her evaluations took place within the context of therapy.
     
      
      . At the time Tommy made this statement, he was no longer a student at ESU 16 and had been mainstreamed at a high school. Jones could not determine where Tommy had gained knowledge of the term.
     
      
      . By the same reasoning, the witnesses’ opinions could not be based solely on the statements of the parents of the children.
     
      
      . The opinion may be found at 11 F.3d 782 (8th Cir.1993). This opinion was decided on rehearing and vacates United States v. Whitted, 994 F.2d 444 (8th Cir.1993), an opinion cited by both parties in this action. Whitted is a post-Daubert case which concerned testimony by a physician that based on the physical examination of an alleged child sex-abuse victim and the patient’s statements, he concluded the child had been sexually abused. The opinion, however, does not discuss the Daubert factors.
     
      
      . In Commonwealth v. Dunkle, 529 Pa. 168, 602 A.2d 830, 832-36 (1992), the court described a study by Lenore Walker, contained in her book Handbook on Sexual Abuse of Children, Assessment and Treatment Issues, 77 (1988), which included a “compilation of a study showing what percentage of sexually abused children exhibited what behaviors.” Summarizing the chart included in the opinion, the court stated,
      As this chart graphically demonstrates, sexually abused children (1) cannot be fit into any specific behavior patterns; (2) for every symptom that was exhibited by any percentage, an even larger number do not exhibit that symptom; and (3) not one, single symptom was exhibited by a majority of sexually abused children. Clearly, these types of percentages cannot constitute probative evidence.
      
        Commonwealth v. Dunkle, 529 Pa. at 178 n. 16, 602 A.2d at 835 n. 16.
     
      
      . Similarly, plaintiffs produced no evidence of a consensus that such indicators exist for physically or emotionally abused children, especially given the lack of physical evidence of abuse, as is the case with nearly all of the plaintiffs.
     
      
      . Although Dunkle was decided prior to Daubert, the case discusses at great length the reliability of this type of evidence. It also relies heavily on the Third Circuit’s decision in United States v. Downing, 753 F.2d 1224 (3d Cir.1985), which is cited with approval in Daubert.
      
     
      
      . The court further discussed the possibility that this type of expert testimony could conflate actions which are “normal” with actions caused by sexual abuse. The court noted,
      Richard A. Gardner, M.D., a practicing child psychiatrist in a recent book entitled Sex Abuse Hysteria, Salem Witch Trials Revisited (1991), contends that many normal behaviors are often taken as "evidence” of child abuse, namely bedwetting in young children, nightmares, temper tantrums, and masturbation. Id. at 60-65. Furthermore, many of the so-called abnormal behaviors attributed to victims of sexual abuse in fact have "nothing to do with Sex Abuse.”
      
        Commonwealth v. Dunkle, 529 Pa. at 178 n. 15, 602 A.2d at 834 n. 15. See also State v. Cressey, 628 A.2d at 701 (criticizing the evidence in that case as lacking “any recognizable, logical nexus between many of the identified symptoms and the conclusion that the children have been sexually abused.... [M]any of the indicators that were noted during the evaluations as signs of abuse, when viewed independently, do not even suggest that anything is wrong with the children, much less lead to a logical conclusion that they have been sexually abused.”).
     
      
      . Although Daubert indicated that Rule 702 had supplanted Frye, it indicated that the “general acceptance” test remained one of the considerations for the reliability analysis. Daubert, — U.S. at —, 113 S.Ct. at 2797.
     
      
      . I am concerned by this assertion in the affidavits. First, exhibit 16 indicates that one of the protocols for the clinical interview is to video tape the sessions. Sullivan and Scanlan acknowledged that no video tapes were made of any of the plaintiffs and that the actual practice at Boys Town at the time the interviews took place was to not record the interview in any manner. The practice of video taping the interviews was begun at Boys Town relatively recently and after the relevant interviews were completed. It is obvious that the protocol which Sullivan and Scanlan allege they utilized was not actually utilized. While some of the procedures which later evolved into a "protocol” might have been used, Sullivan acknowledged in her deposition that they were not final then, and had not yet been published. This protocol was finalized after the plaintiffs' interviews. Because of the general statements in the affidavits, it is utterly impossible for the court to determine just which portions of the protocol were actually involved in the interviews.
      My second concern lies with the same assertion in Jones' affidavit. She also stated in depositions that she does not video tape or otherwise record interview sessions. Further, she testified that she reached her conclusions about the alleged abuse of plaintiff Lute after conducting therapy with him. Jones did not conduct an evaluation for abuse; rather, she treated the plaintiff for abuse and later concluded that the abuse occurred. Her statements in the deposition and affidavit are thus inconsistent.
     
      
      . As a general matter, the role of anatomically correct dolls as a diagnostic or interpretive tool to determine whether sexual abuse has occurred is “controversial.” Myers, supra, at 63 n. 259. “The dolls are not a test for sexual abuse, and it is not appropriate to base conclusions on doll play alone. However, it is proper to employ the dolls, and to consider the child’s interaction with them as a small but relevant piece to the puzzle.” Id. at 84. Given this particular use’s controversial nature, the lack of evidence regarding proper training of clinicians in their use and the lack of any video tapes to ensure the dolls’ proper usage, plaintiffs have failed to demonstrate that any evidence of abuse based on the diagnostic utilization of the dolls would be sufficiently reliable to assist the jury in this action.
     
      
      . Obviously, if one may not determine whether a child was actually abused, she may not reliably state that a psychologist's opinion that the child was abused was accurate. Thus, it would be impossible to determine the error rate of past studies utilizing a particular methodology.
     
      
      . Defining “context,” Faller states:
      The context of the sexual abuse includes when and where it happened; what the victim and offender were wearing; what clothing was removed and by whom; where other members of the family were; how the perpetrator induced the child to become involved; and whether the offender admonished the victim not to tell, or said anything else in the course of the abuse. Sometimes the child will relate an idiosyncratic event that occurred during the course of the abuse that enhances his or her credibility.
      Faller at 391, quoted in Myers, supra at 75-76 n. 311.
     
      
      . Further, because the precise actions constituting the alleged abuse could not be described, the meaning of the witnesses' conclusions about “abuse” is unclear. The three types of abuse have no accepted psychological definition. The experts each relied upon a different definition of what constitutes abuse. For instance, Jones' definitions of the various types of abuse keyed on Colorado law; Scanlan's definitions keyed on Minnesota law. It does not appear that any of the witnesses' definitions of sexual abuse was the same as that which appears in the pretrial order, that being the definitions used by Nebraska statutes. The amorphous definitions of abuse often led to questionable conclusions. For instance, Sullivan’s definition of sexual abuse includes allowing a child to view a sexual act. Physical abuse, under the various definitions, includes a teacher spanking a child, regardless of how light a spank it might be; however, a parent spanking a child in precisely the same manner might not be physically abusing the child.
     
      
      . The exceptions were those instances where a plaintiff had mentioned a staff person or reacted negatively to a doll given the name of a staff person.
     
      
      . For instance, the witnesses’ opinion that plaintiffs suffered permanent injuries is based on the conclusion that plaintiffs were abused. The witnesses may not testify to such an opinion.
     
      
      . Allowing testimony of the plaintiffs' behavior based on, e.g., the CBC's or on information provided by the parents or clinical interview would allow the witnesses to vouch for another person’s credibility or rely on evidence which has not been shown to be reliable. See Azure, supra.
      
     