
    UNITED STATES of America, Plaintiff, v. Julio Adalberto HIDALGO, Sr.; Julio Hidalgo, Jr., Defendants.
    No. CR-01-1011-PHX-FJM.
    United States District Court, D. Arizona.
    Nov. 6, 2002.
    
      Thomas M. Hoidal, Hoidal & Hannah, PLC, Phoenix, AZ, for Julio Adalberto Hidalgo, Sr.
    Alex D. Gonzalez, Gonzalez & Smith, PC, Mesa, AZ, for Julio Hidalgo, Jr.
    Richard I. Mesh, U.S. Attorney’s Office, Phoenix, AZ, for U.S.
   ORDER

MARTONE, District Judge.

The defendants challenge the admissibility of the opinions of a forensic document examiner on whether certain writings and handprintings are theirs. We have read Hidalgo Sr.’s motion for a Daubert hearing regarding handwriting analysis (doc. 18), Hidalgo, Jr.’s joinder, the Government’s memorandum in opposition, Hidalgo, Sr.’s reply, Hidalgo, Sr.’s notice of expert witness testimony, the Government’s notice of expert witness testimony, the Government’s notice of supplementary testimony, the Government’s second notice of supplementary testimony, Hidalgo, Sr.’s supplemental memorandum re: admissibility of handwriting analysis, and the Government’s post-Daubert hearing brief in support of proffered handwriting expert.

We granted the defendants’ motion for a Daubert hearing and heard testimony from William J. Flynn, a forensic document examiner, and Dr. Moshe Kam, a professor of electrical engineering, on behalf of the government’s position. We also heard the testimony of Dr. Michael J. Saks, a professor of psychology and law, on behalf of the defendants. We first summarize the evidence and our findings. We then describe our understanding of the law on the issue in the post-Daubert, post-Kumho setting. We then reach our conclusions.

I.

A.

Handwriting analysis is based upon the premise that each person’s handwriting is unique. (See Flynn Aff. at 2-3). This assumption is key because it is uniqueness that allows a handwriting analyst to establish authorship against all other writers in the world. To establish uniqueness, the government points to a recent study of Professor Sargur Srihari. Sargur Srihari et al., Individuality of Handwriting, 47 J. Forensic Sci. 856 (2002). Professor Srihari’s research team scanned the handwriting samples of 1,500 individuals into a computer. The computer was then programmed to analyze and compare the samples based on a variety of features such as slant, height, the number of interior contours, and the number of vertical slope components. When the computer was asked to match the exemplars, it was able to do so with a 98% accuracy rate. The government argues that this study inferentially proves that handwriting is unique because otherwise the computer would not have been able to differentiate among the exemplars. Yet the Srihari study fails to establish uniqueness. At most, we can reasonably infer that among 1,500 writers, very few write in a similar way.

The government also points to the existence of several studies arguably proving that the writing of identical twins, while strikingly similar, is nonetheless distinguishable. Horatio H. Newman et al., Twins: A Study of Heredity and Environment (1937); Mary S. Beacom, A Study of Handwriting by Twins and Other Persons of Multiple Births, 5 J. Forensic Sci. 121 (1960); D.J. Gamble, The Handwriting of Identical Twins, 13 Can. Soc’y Forensic Sci. J. 11 (1980); J.H. Wanscher, The Hereditary Background of Handwriting: An Investigation of the Handwritings of Mono and Dizygotic Twins, 18 Acta Psychol. Et Neurology 23. Each of the four studies in evidence was based on an evaluation of the handwriting of identical twins. In only two of these was the evaluator blind as to whether the exemplars were those of an identical twin. The two blind studies were designed to determine whether handwriting has a genetic basis, not whether the handwriting of identical twins is distinguishable. Because of this, the studies adopted classification schemes that are too imprecise for our purposes. One study classified the writing of thirteen out of twenty-nine identical twins as “identical[ ] or very similar[ ].” Wanscher, supra, at 360. The other grouped many twins as writing “alike.” Newman, supra, at 125-26. We thus do not know whether any of those who wrote “alike” wrote identically.

That leaves us with the two non-blind studies directed at the question of whether identical twins ever write identically. The authors reported differences in the handwriting of identical twins. Yet in each study a single evaluator applied an intrinsically subjective protocol. Because forensic document examiners assert that no person writes the same way twice (see Flynn Aff. at 2-3), it is hard to say how the examiners accurately concluded that none of the participants wrote identically. Forensic document examiners were not asked to distinguish between the handwriting of identical twins in any of these studies. We therefore do not know whether the handwriting of identical twins is sufficiently differentiated for practical purposes.

We are, of course, aware that it would be impossible to analyze and compare the handwriting of every literate person. Uniqueness must therefore be demonstrated, if at all, inferentially. Although we can speculate as to one way in which such a demonstration might be made, no such showing is in evidence. At the end of the day, we are left with the assertion of the forensic document examination community that, in their experience, handwriting is unique.

B.

The government has been more successful in establishing that forensic document examiners possess skills that exceed those of lay persons. A study by Professor Moshe Kam is the most useful. Moshe Kam et al., Writer Identification by Professional Document Examiners, 42 J. Forensic Sci. 778 (1997). Professor Kam asked more than one hundred forensic document examiners and forty-one non-professionals to determine the authorship of six unknown documents from a library of twenty-four exemplars. Although professionals and non-professionals made correct matches at about the same rate, the false positive rate for professionals was 6.5% compared to 38.3% for non-professionals. In other words, the non-experts were almost six times as likely to make a match where no such match should have been made.

Additional studies by Professor Kam and others provide support for the conclusion that forensic document examiners are more accurate than laypersons. Professor Kam conducted a study on the ability of forensic document examiners to identify signatures. Moshe Kam et al., Signature Authentication by Forensic Document Examiners, 46 J. Forensic Sci. 884 (2001). Forensic document examiners demonstrated a false positive error rate of just 0.5%.

A study conducted in Australia by Bryan Found and Doug Rogers reported that forensic document examiners were correct 91.5% of the time in declaring a signature a forgery. Bryan Found & Doug Rogers, Revision and Corrective Package: Signature Trial (2001) (unpublished CD-ROM). When the examiners identified a signature as genuine, they were correct 98.2% of the time.

The defendants argue that notwithstanding these studies there is no significant evidence of the reliability of handwriting identification. The defense position is not that handwriting identification is unreliable, but instead that the government has not demonstrated its reliability. Professor Saks made it clear that he is “agnostic” on whether document examiners are reliable. The defense contends that the Kam studies provided the participants with unrealistically easy tasks. We have considered this question and conclude that the tasks were sufficiently rigorous. Moreover, the task presented in Professor Kam’s second study is closely analogous to the task in this case.

The defendants point to the lack of incentives for non-professionals. They argue that professional document examiners knew that the status of their profession rode on the outcome of these tests. They therefore had every incentive to be cautious in declaring matches. Non-professionals, in contrast, had no incentive to be careful and were thus more cavalier in declaring matches. This disparity, the argument goes, explains why professionals made fewer false positive errors than non-professionals. Professor Saks asserts that the effect of lack of incentives can be seen in the bi-modality of the data in Professor Kam’s pilot study. Professor Saks claims that while most non-professionals performed poorly, a few performed as well as professionals. He contends that this shows that some non-professionals were motivated while others were not, and that motivation positively correlates with outcome.

We do not agree. The worst professionals made two errors. The best non-professionals made about nine errors, while the worst non-professionals made about forty-four errors. Even the worst professionals clearly outperformed the best non-professionals. We conclude that the effect, if any, of incentives has not been sufficiently developed to affect our analysis. For example, Professor Kam conducted another study in which non-professionals were divided into four groups and exposed to four different monetary incentive schemes. Moshe Kam et al., Effects of Monetary Incentives on Performance of Non-Professionals in Document-Examination Proficiency Tests, 43 J. Forensic Sci. 1000 (1998). He found no statistically significant differences among the groups. While a small monetary reward would not be as strong an incentive as the prospect of losing one’s professional livelihood, the available evidence suggests that this concern might go to the weight rather than the admissibility of the evidence.

II.

Rule 702 of the Federal Rules of Evidence governs the admissibility of expert testimony and provides that:

If scientific, technical, or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training, or education, may testify thereto in the form of an opinion or otherwise, if (1) the testimony is based upon sufficient facts or data, (2) the testimony is the product of reliable principles and methods, and (3) the witness has applied the principles and methods reliably to the facts of the case.

Fed.R.Evid. 702. The Supreme Court has directed district courts to serve as gatekeepers and to determine the reliability of expert testimony within the meaning of Rule 702 before admitting it. Daubert v. Merrell Dow Pharm., Inc., 509 U.S. 579, 597, 113 S.Ct. 2786, 125 L.Ed.2d 469 (1993); Kumho Tire Co. v. Carmichael, 526 U.S. 137, 141, 119 S.Ct. 1167, 143 L.Ed.2d 238 (1999).

In Daubert, the Supreme Court identified five factors to consider in admitting or excluding expert testimony: (1) whether the theory or technique can be or has been tested; (2) whether it has been subjected to peer review and publication; (3) the known or potential error rate; (4) the existence and maintenance of standards controlling the technique’s operation; and (5) its general acceptance within the scientific community. Daubert, 509 U.S. at 593-94, 113 S.Ct. 2786. These factors are neither mandatory nor exclusive. The district court can base its reliability determination on “Daubert’s factors or any other set of reasonable reliability criteria.” Kumho Tire Co., 526 U.S. at 158, 119 S.Ct. 1167. Moreover, a district court enjoys “broad latitude” both in deciding how to go about determining the reliability of challenged expert testimony and in its ultimate reliability determination. Id. at 142, 119 S.Ct. 1167.

Kumho made it clear that the Daubert gatekeeping obligation applies not only to “scientific” testimony, but to all expert testimony, including technical and other “nonscientific” expert testimony. Id. at 141, 119 S.Ct. 1167. In determining the reliability of nonscientific expert testimony, a district court may consider one or more of the factors outlined in Daubert. Id. However, the “test of reliability is ‘flexible.’” Id. Thus, “Daubert’s list of specific factors neither necessarily nor exclusively applies to all experts or in every case.” Id. Instead “the gatekeeping inquiry must be tied to the facts of a particular case.” Id. at 150, 119 S.Ct. 1167 (internal quotation marks omitted).

Before Daubert, handwriting analysis testimony was admissible under the Frye general acceptance standard. See, e.g., United States v. Fleishman, 684 F.2d 1329, 1337 (9th Cir.1982) (“It is undisputed that handwriting analysis is a science in which expert testimony assists a jury.”); Robles v. United States, 279 F.2d 401, 404-05 (9th Cir.1960) (“It is well settled that an expert in handwriting may testify and state his opinion as to whether different documents or signatures were written by the same person ....” (internal quotation marks omitted)). Daubert and Kumho, however, have changed the general rules of admissibility. Because general acceptance is now but one of many factors a trial court may consider, it no longer serves as the sine qua non of admissibility. Courts are now confronting challenges to testimony, as here, whose admissibility had long been settled.

The United States Court of Appeals for the Ninth Circuit has not addressed the issue of whether handwriting analysis is admissible under Daubert and Kumho. Only two federal circuit courts have considered the issue post-Kumho. Neither of them discussed the Daubert factors or any other indicia of reliability in determining that it was not an abuse of discretion to admit the challenged expert testimony. United States v. Paul, 175 F.3d 906 (11th Cir.1999); United States v. Jolivet, 224 F.3d 902 (8th Cir.2000).

We have located nine district court cases that have directly addressed the issue of whether the expert testimony of a forensic document examiner is admissible under Daubert and Kumho. No consensus has emerged. Only two courts have found the testimony to be reliable and fully admissible. United States v. Gricco, No. 01-90, 2002 WL 746037, 2002 U.S. Dist. LEXIS 7564 (E.D.Pa. Apr. 26, 2002); United States v. Richmond, No. 00-321, 2001 WL 1117235, 2001 U.S. Dist. LEXIS 15769 (E.D.La. Sept. 21, 2001). Four courts have determined that the forensic document examiner’s testimony was not based on sufficiently reliable principles and methodologies under Daubert/Kumho and fully excluded the expert’s testimony. United States v. Lewis, 220 F.Supp.2d 548 (S.D.W.Va.2002); United States v. Brewer, No. 01 CR 892, 2002 U.S. Dist. LEXIS 6689 (N.D.Ill. Apr. 12, 2002); United States v. Saelee, 162 F.Supp.2d 1097 (D.Alaska 2001); United States v. Fujii, 152 F.Supp.2d 939 (N.D.Ill.2000). Three courts reached a middle position, permitting the forensic document examiner to testify as to particular similarities and dissimilarities between the documents, but excluding the ultimate opinion as to authorship. United States v. Rutherford, 104 F.Supp.2d 1190 (D.Neb.2000); United States v. Santillan, No. CR-96-40169, 1999 U.S. Dist. LEXIS 21611 (N.D.Cal. Dec. 3, 1999); United States v. Hines, 55 F.Supp.2d 62 (D.Mass.1999).

III.

The Government proposes to have Mr. Flynn testify, based upon his examination and comparison of questioned items and known documents, that (1) Hidalgo, Jr. executed the bulk of the handprintings and text appearing on the original uniform residential loan application from Nelson Hernandez, (2) Hidalgo, Jr. probably executed the writings on a 1994 W2 form, (3) there are indications that Hidalgo, Sr. executed the handprintings and text appearing on the original request for verification of employment form addressed to Sandoval Masonry dated February 10, 1997, (4) neither Hidalgo, Jr. nor Hidalgo, Sr. could be identified as having written the writings and text on the original request for verification of employment form addressed to R.C. Nutrition Center dated February 6, 1996 and the request for verification of employment form addressed to El Tanampa Restaurant dated June 21, 1995, and (5) he can read certain typewriting on the 1994 W2 form.

The evidence produced at the Daubert hearing and the reports and affidavits received support different findings as to whether an expert can testify regarding the identity of handwriting and handprinting, on the one hand, or something less than that, on the other.

A. Opinion Testimony as to Authorship of Questioned Documents

The Government offers the uniqueness of handwriting as a scientific principle. But there is no evidence before me to support the thesis that handwriting is unique. The Srihari study supports the proposition that very few people write in a similar way. Mr. Flynn asserts the uniqueness of handwriting but while the hypothesis is testable within the meaning of Daubert, it has not been fairly tested. It is true that the uniqueness principle is generally accepted in the forensic document examiner field, but that in itself is insufficient under Daubert. Peer review in this area cannot be said to be any different from its general acceptance in the forensic document examiner community. Indeed, not even Professor Kam posited the theory that handwriting is unique. His research supports the proposition that document examiners are better than lay persons in excluding false positives. Professor Kam understands that while this may be helpful to a jury, it does not support the uniqueness principle upon which identification opinion testimony is based.

We therefore find and conclude that the principle of uniqueness of handwriting or handprinting fails to satisfy a Daubert/Kumho analysis. If the principle of uniqueness could be proven, then one would know how to analyze handwriting or handprinting with an error rate of zero percent. But there is no support for the proposition, nor does the government contend, that document examiners have a zero percent error rate.

The foundation for a document examiner’s identification between a known document and a questioned document is the principle of uniqueness. Because the principle of uniqueness is without empirical support, we conclude that a document examiner will not be permitted to testify that the maker of a known document is the maker of the questioned document. Nor will a document examiner be able to testify as to identity in terms of probabilities.

B. The Mechanics and Characteristics of Handwriting, Including Similarities

Because the Government has failed to prove the principle of uniqueness, we have excluded any expert opinion testimony that the handwriting or handprinting on any questioned document is in fact the handwriting or handprinting of the defendants. In contrast, the Government has proven that forensic document examiners possess skills that are better than those of lay persons in identifying the authorship of questioned documents. While it is true that professionals and non-professionals make correct matches at about the same rate, the false positive rate for non-professionals is about six times that of professionals.

Professor Kam tested the hypothesis that questioned document examiners possess a skill different from that of lay persons. I find and conclude that his conclusion is credible. This proposition has been tested, and has been subject to peer review and publication. Error rates have been established, and, of course, there has been general acceptance. Kumho instructs us that where the basis for an expert’s testimony is not scientific (as here), “the relevant reliability concerns may focus upon personal knowledge or experience.” 526 U.S. at 150, 119 S.Ct. 1167. And, as the Court acknowledged, id. at 151, 119 S.Ct. 1167, “some of Daubert’s questions can help to evaluate the reliability even of experience-based testimony.”

It is intuitive that someone who is trained in and has experience in the analysis of handwriting is likely to be better at it than someone who is not. The Government’s evidence in this case proves this to be so. Mr. Flynn uses a methodology which is the standard of the American Society for Testing and Materials. He first examines the known writings and then makes a side-by-side comparison to the questioned writings. He sees whether they have general features in common. He then sees if they have individual patterns. He evaluates strokes and characteristics, and the personal alphabets of the known and questioned writings. While the failure of proof of the uniqueness principle would preclude him from rendering an opinion of identity, he could, based upon his experience and training, testify to the mechanics and characteristics of handwriting, his methodology, and his comparisons of similarities and dissimilarities between the defendants’ known writings and those of the questioned documents. He could point out to the jury things that the jury might not see on its own. It would then be left to the jury to make the ultimate finding of identity or non-identity.

IV.

For the foregoing reasons, we GRANT the defendants’ motion to exclude expert opinion testimony that the handwriting on the questioned documents is in fact the handwriting of a defendant. We DENY the defendants’ motion to exclude testimony on the mechanics and characteristics of handwriting or handprinting, methodology, comparisons of similarities and dissimilarities, and any other factors that would be helpful to the jury in making a finding of identity or non-identity, short of an ultimate opinion.

We acknowledge that today’s ruling is applicable to a case set for trial in the fall of 2002. We are not unmindful of the fact that in light of the pressure brought to bear on forensic document examination (and other areas of expertise) by Daubert and Kumho, further research, testing, and publication are likely to proceed at an accelerated pace and thus future rulings on this topic may be influenced by future developments.
      
      1. The government also points to the existence of the Forensic Information System for Handwriting (“FISH”), a computer system in use by the United States Secret Service. Evidently, the Secret Service has been able to match the handwriting of several individuals who have written threatening letters to government officials. The papers describing FISH do not allow us to draw a meaningful conclusion about FISH’s effectiveness.

      2. This study notes Francis Galton’s 1883 report of a pair of identical twins who wrote so much alike that neither twin was able to recognize his own writing. Wanscher, supra, at 350.

      3. One way would be a mathematically rigorous application of the product rule, which proposes that if a forensic document examiner finds two rare traits in an individual’s handwriting, the examiner can multiply the rates to determine the rate at which one can expect to find both of them.

      4. The government has submitted a re-analysis by Professor Kam which compares the accuracy rates of professionals with printed documents in contrast to cursive or mixed writing. The false positive error rate for printed documents was 9.3% compared to 5.5% for non-printed documents. Professor Kam states that this increase is not statistically significant. He was unable to include the results of three of the packets that were used in the original study. This caused the false positive rate for all documents to go from 6.5% in the original study to a little over 7% in the re-analysis.

      5. Professor Kam also conducted a pilot study that examined the identification skills of seven FBI-trained forensic document examiners vis-à-vis ten lay persons. Moshe Kam et al., Proficiency of Professional Document Examiners in Writer Identification, 39 J. Forensic Sci. 5 (1994). The professional group made four errors while the non-professional group made 247.

      6. Accuracy was 55.3% when participants stated an opinion with regard to a disguised signature. A disguised signature is one in which the author has tried to make it appear as a forgery but is in fact genuine.

      7. The defense does, however, argue that the results reported by Collaborative Testing Services, Inc. (“CTS”) raise serious questions as to the reliability of handwriting analysis. CTS issues tests by which individuals or laboratories may evaluate their forensic skills. CTS reports the results that are returned to it. Since this data lacks a control group and other hallmarks of scientific rigor, we place no significant weight on the CTS data.

      8. In contrast to the uniqueness principle, which purports to be based on science (see part III(A) supra), a document examiner’s superior skill is based on training and experience.
     