
    Iraj DANAIPOUR, Petitioner, Appellee, v. Kristina McLAREY, Respondent, Appellant.
    Nos. 02-1065, 02-1070.
    United States Court of Appeals, First Circuit.
    Heard March 6, 2002.
    Decided April 3, 2002.
    
    Elizabeth B. Burnett with whom Beth I.Z. Boland, Jennifer Sulla, Francine Wachtmann, Meredith Brand Wade, and Mintz Levin Cohn Ferris Glovsky and Popeo, P.C. were on brief for appellant.
    Stephen J. Cullen with whom Mary A. Azzarito, Nancy J. Baskin, and Miles & Stockbridge, P.C. were on brief for appellee.
    Barbara F. Berenson, Assistant Attorney General, Thomas F. Reilly, Attorney General, and Deepa S. Isac, Assistant Attorney General, on brief for the Commonwealth of Massachusetts, amicus curiae.
    Barry S. Pollack, Anthony L. Bolzan, Amber R. Anderson, and Dechert Price & Rhoads on brief for Leadership Council for Mental Health, Justice, & the Media, Massachusetts Society for the Prevention of Cruelty to Children, Community Legal Services and Counseling Center, Children’s Law Center of Massachusetts, Inc., Gloucester Men Against Domestic Abuse, and Emerge, Inc., amici curiae.
    Nora Sjoblom Sanchez and Pauline Quirion on brief for Massachusetts Citizens for Children, Women’s Bar Association of Massachusetts, Greater Boston Legal Services, Jane Doe Inc., Domestic Violence Council, Inc., and Rebecca M. Bolen, amici curiae.
    Before LYNCH, Circuit Judge, CAMPBELL and BOWNES, Senior Circuit Judges.
   LYNCH, Circuit Judge.

Among the federal courts’ most difficult and heart-rending tasks is the decision under the Hague Convention on the Civil Aspects of International Child Abduction whether to return an abducted child to the child’s home country when a parent claims the child will face a grave risk of physical or psychological harm if returned. Hague Convention on the Civil Aspects of International Child Abduction, opened for signature Oct. 25, 1980, T.I.A.S. No. 11,670, 1343 U.N.T.S. 89 [hereinafter Hague Convention]. In the Convention’s enabling legislation, International Child Abduction Remedies Act, 42 U.S.C. §§ 11601-11610 (1994), Congress has assigned the federal courts jurisdiction over actions arising under the Hague Convention, including the question of whether return poses a grave risk of harm to the child. Id. § 11603.

The district court struggled conscientiously here and concluded that, though there was reason to believe the father had sexually abused his three-year old daughter, the Swedish courts should ultimately decide whether sexual abuse had occurred and, if so, what steps are necessary to protect the children. The court held that numerous “undertakings,” with what the court assumed would be parallel enforcement by the Swedish courts, would adequately protect the children until the Swedish courts could decide the matter.

Without deciding whether a United States court may ever defer its responsibility to determine the grave risk of injury question to a foreign court, we hold on these facts that the court erred and that it overstepped its authority in issuing conditions under which it thought the children could be returned. We reverse and remand for further proceedings consistent with this opinion.

I.

We summarize the facts of this case, as they were presented to the district court.

Kristina McLarey is a dual citizen of Sweden and the United States; Iraj Danaipour is a Swedish citizen and an Iranian national. Danaipour is a practicing child psychologist. The two met in Sweden. McLarey and Danaipour had their first daughter in 1994; shortly thereafter, the two were married in Massachusetts. Their second daughter was born in 1998. Both children were born in Sweden and lived there for most of their lives. In the summer of 1999, during a visit to Massachusetts, Danaipour announced to McLarey’s parents that the two would be divorcing. In February 2000, the two filed for divorce, although they continued to cohabitate in their condominium in Stockholm.

During the first half of 2000, McLarey and Danaipour’s relationship deteriorated further. McLarey alleges that Danaipour was abusive and controlling, and that she began to suspect he was having inappropriate sexual contact with their daughters. McLarey states that on several occasions she observed Danaipour pinching the girls’ nipples and squeezing the older girl’s buttocks. She also reports that the older daughter began to exhibit sexualized behavior, such as attempting to kiss her mother hard on the mouth.

In June 2000, McLarey traveled to the United States with the girls to visit her parents. The following month, Danaipour also came to the United States to visit the girls. McLarey alleged that, on two occasions following the girls’ visits with their father, she observed unusual redness in the younger girl’s vaginal areas. When Danaipour returned to Sweden, he petitioned for, and received, full custody of the girls from a Swedish court. McLarey then returned to Sweden, residing in a women’s shelter in Stockholm, and petitioned for joint custody. Approximately four weeks later, around October 2000, the Swedish court granted McLarey possession of the condominium and joint custody of the children, with physical custody to alternate between the two parents on a weekly basis. McLarey promised the Swedish courts she would not remove the children from Sweden again, and she surrendered her passport and those of her children to her Swedish attorney.

McLarey alleges that she again began noticing the vaginal redness after the girls returned from visits with their father. Upon the advice of a pediatric nurse, McLarey consulted a child psychologist. After that first visit with the child psychologist, McLarey says that she asked the younger child what caused the redness and that the child answered “Baba [her word for her father] do like this” and made a masturbatory motion. McLarey also claims she asked the older child, who had no answer. McLarey says she then told her older daughter that “no one should ever touch you anywhere unless you want them to,” to which the girl responded, “what would you say if I told you they had?” The girls did not see the psychologist, because that would have required their father’s consent.

On November 21, 2000, the psychologist issued a report of suspected child sexual abuse and referred the case to the Swedish social services administration, which then referred it to the Stockholm police. The police interviewed the older daughter for a little over one hour, and the younger daughter for about ten minutes; the younger child refused to speak with them at all. In January, the younger girl was examined medically, and nothing out of the ordinary was found. The police then terminated their investigation.

McLarey then returned to the Swedish social services agency, requesting a full professional sexual abuse investigation. The agency did initiate a general investigation, but informed McLarey that it could not investigate whether sexual abuse had occurred without consent from Danaipour, which was not forthcoming. McLarey then turned to the United States embassy and the members of the Swedish parliament for assistance, but to no avail. McLarey alleges that, during this time period, the younger daughter repeatedly stated that “Baba” had hurt her “pee pee” and that she exhibited symptoms of abuse, including recurrent nightmares, avoiding bowel movements, and sexually inappropriate behavior towards McLarey’s fiancé.

On March 29, 2001, McLarey filed a motion with the Stockholm District Court requesting a full sexual abuse investigation, which Danaipour opposed. McLarey filed letters from her fiancé and her mother with the Swedish court. The fiancé reported, among other things, that he heard the younger girl cry out “Baba ... ouchy ... no, no!” during the night; that he heard her say that Baba “hammers” her, making a motion between her legs; and that she had asked him to “play with her pee pee.” The mother’s letter reported that she had seen unusual vaginal redness in both girls following visits with their father; that the older girl had asked her, “what does [her sister] mean when she says that Baba does this?” making a masturbatory motion between her legs; and that the younger girl had made other comments complaining of pain in her vaginal area or expressing general fear of her father. On June 13, 2001, the Swedish court denied the motion for a full investigation.

Also during this time-frame, the Swedish authorities concluded the custody evaluation conducted as part of the divorce proceedings. On May 31, 2001, the Swedish authorities issued a report, based on home visits, meetings and visits to the girls’ schools, which found that “nothing has been established ... that suggests that either of the girls have been subjected to sexual molestation.... Both parents are very competent as parents and have a fine, close and natural contact with the daughters.” There was never a full investigation done in Sweden into the specific question of whether either girl had been sexually abused, however.

On June 25, 2001, McLarey left Sweden with the girls and returned to the U.S. The parties agree that this violated a Swedish court order. Upon arrival, she arranged for a sexual abuse evaluation, but the three-year-old refused to speak with the evaluator. McLarey also filed, on July 3, 2001, a declaratory action under the Hague Convention (which was later dismissed by agreement of the parties). On August 22, 2001, Danaipour filed a petition in the Family Court division of the Massachusetts state court seeking return of the children under the Hague Convention. On September 5, 2001, upon McLarey’s motion, the action was removed to the federal district court.

In September 2001, McLarey began taking the girls to therapy sessions with Dr. Toni Luxenberg, a clinical psychologist. Over the course of several months of weekly sessions with Dr. Luxenberg, the younger girl made various statements that could be taken as indicating that her father had her masturbate him and that he had masturbated himself in her presence. She also told the doctor that she did not ever want to go to Sweden to see her father. The older girl did not make any statements indicating that she had been sexually abused, but did tell Dr. Luxenberg that her sister had told her about the abuse, that she thought her father did it, and that she was worried and frightened she would have to go back to Sweden.

The district court held an initial scheduling hearing on November 1, 2001. At that time, the court stated that

I’m not necessarily being asked in this case to decide whether the allegations of child abuse are proven by clear and convincing evidence, but I’m being asked to decide whether there are feasible conditions under which the children can be returned to Sweden so the Swedish authorities can decide those issues.

On the following day, the district court appointed a guardian ad litem (“GAL”) for the children and entered a scheduling order requiring that discovery be completed by December 17, 2001.

On November 21, Danaipour, along with his retained expert, proposed that the district court issue an order pursuant to Federal Rule of Evidence 706, ordering that the children be given an independent sexual abuse evaluation. Also on that date, the GAL filed her preliminary report, which emphasized the need for a “prompt resolution,” preferably before the Swedish school term began in January. On December 3, McLarey filed a motion requesting a forensic sexual abuse evaluation prior to trial. McLarey stated that she believed that the evidence in the record was sufficient to meet her burden of proof under the Hague Convention, but that the judge should order an evaluation if he thought it would be helpful or necessary for him to make a finding on whether sexual abuse had occurred. McLarey’s motion argued that a finding of sexual abuse would constitute a per se “intolerable situation” under the Hague Convention; that any evaluations performed in Sweden would not be effective; and that the results of an evaluation here could inform the court’s consideration of the possibility of using undertakings in the event of a return order. Along with her motion, McLarey submitted expert affidavits from two medical doctors indicating that the children’s willingness and ability to disclose information relating to any sexual abuse would be markedly diminished if they were returned to Sweden. Thus, as of December 3, both parties agreed to the need for a full evaluation and contemplated it would be done here under the court’s supervision. When the GAL made her November 21 recommendation urging prompt disposition of the case, the GAL did not then know that both parties would be seeking a full evaluation.

On December 4, 2001, the court held a preliminary hearing, at which it expressed a preference to conclude the trial in December, in accordance with the GAL’s recommendation, and inquired whether a full evaluation could be done in Sweden. At that point, Danaipour’s counsel expressed an absolute opposition to the possibility of any examination taking place in this country. The GAL stated that she did believe that an evaluation needed to be done, but she expressed no opinion as to whether it should be done in the U.S. or in Sweden.

On December 7, the court held a second hearing. At the hearing, Danaipour filed a written objection to an evaluation being done in the United States. However, he submitted no expert affidavits and thus did not controvert the expert testimony submitted by McLarey that an evaluation in Sweden was unlikely to succeed. The court held:

Contrary to what the respondent is contending, I’m not persuaded that evaluations of the children in Sweden will not be effective, given the additional information that’s been generated in this case, and I think it’s neither feasible nor appropriate, given the mandate of the Hague Convention to decide these matters promptly, and given the fact with the agreement of all the parties I’ve set a December 19 trial date, [I] ... deny the motion for a court ordered sexual abuse evaluation.

When the court made this holding, the only expert testimony before it was that it was extremely doubtful, at least, that an evaluation could be successfully performed if the children were returned; further, the GAL took no position as to where the evaluation should be performed, saying it was beyond her expertise. The court further noted that McLarey should have submitted her motion for an evaluation earlier. It does not appear that the court relied on this rationale, however, as it noted that it had “grave doubts” as to whether it would have been allowed even if filed earlier.

Prior to trial, Danaipour submitted proposed “undertakings” to which he would agree if the court ordered the children returned to Sweden. These included an agreement that the children could reside with McLarey in Sweden; an agreement that he would have limited or no visitation pending judgment of the Swedish court; and an agreement that he would participate in a forensic evaluation to determine if sexual abuse had occurred.

On December 14, 2001, shortly before the trial here began, the Stockholm County Court entered an order specifying that McLarey and Danaipour would continue to have joint custody; the children would have supervised visitation rights with their father every Saturday; and “[w]hen the children have been returned to Sweden, the court will consider the question of a child psychiatric evaluation.” (emphasis added). Thus, although the father had removed his earlier objection to a forensic examination of the children, the Swedish court reserved the issue of whether it would require one.

The GAL’s final report, filed on December 17, 2001, recommended that, if the girls were returned to Sweden, the younger child have only telephone contact with her father until a full sexual abuse evaluation was performed, or at least substantially underway, and a counselor was in place for her. Her recommendation was therefore in conflict with the order entered by the Swedish court three days previously.

Trial was conducted as scheduled from December 19 to December 21. The court limited each side to about nine hours of time to put in its case, without objection from the parties. At trial, Dr. Luxenberg testified that based on her treatment of the children, she believed that Danaipour had sexually abused his younger child. She testified specifically that she did not believe the girl had been coached to make a false accusation, as the revelation was made in age-appropriate language, it was accompanied by a distressed emotional reaction, and it contained new information, rather than a repetition of a rote script. Dr. Luxenberg also testified that she could not, at this stage, answer whether Danaipour had sexually abused his older daughter. She also testified that, in her medical opinion, the younger girl suffers from Post-Traumatic Stress Disorder (“PTSD”).

McLarey also presented the expert testimony of Dr. Bessel van der Kolk, an expert in the field of child trauma. He concurred, based on his review of Dr. Luxenberg’s notes and a short videotape of the younger child, that the younger girl suffers from PTSD. He testified that he did not agree that the older child suffers from PTSD at this time, but he testified that returning either child to her father at this time would be “devastating.” A third expert, Dr. Carole Jenny, a professor of pediatrics at Brown University Medical School and an expert in sexual abuse evaluations, also testified for McLarey. Dr. Jenny testified that, in her opinion, to a reasonable degree of medical certainty, the younger girl had been sexually abused. She specified that she did not believe the child had been coached. She based her opinion on a review of Dr. Luxenberg’s treatment notes and reports, a review of the Swedish social services and police reports, an interview with McLarey, and observation of the trial testimony. Dr. Jenny also testified that she did not believe an effective investigation could be conducted in Sweden, because the children would be unlikely to talk if returned. A Swedish lawyer, specializing in child abuse cases, also testified for McLarey as an expert on the Swedish legal system and on the feasibility of “undertakings” as a way to cabin the district court’s order. The expert testimony is discussed in more detail in the pertinent sections below.

Danaipour’s expert, Dr. Carlton Munson, a Ph.D. in clinical social work, but not an M.D., testified that in his opinion neither child suffered from PTSD. He also testified that one could not predict whether a return to Sweden would negatively affect the children, and suggested it might improve their outlook. Dr. Munson further testified that it was preferable that any evaluation take place in Sweden, “the jurisdiction where the abuse is alleged to have occurred,” because of “the familiarity with the system, familiarity with all of the factors in the locale, the area, what kind of services are available, and even in terms of the culture of the person that has been abused, as well as the culture of the abuser.” He did not explain this further.

Danaipour also testified, denying all allegations of sexual abuse and denying that he had been abusive toward McLarey. He testified that his daughters have seen him and other men naked, which he says is common in Sweden, but that he has never touched them, nor had them touch him, in a sexual way. He testified that he had also observed redness in the girls’ vaginal areas when he and his wife lived together, and that they had consulted health care professionals and were told that this was not out of the ordinary and could be treated with a simple ointment. His explanation for his wife’s accusations is that his wife left him to be with another man, an American, and is now trying to take the children with her, in violation of a Swedish custody order.

On January 2, 2002, the district court issued its decision. Danaipour v. McLarey, 183 F.Supp.2d 311 (D.Mass.2002). The court found Dr. Luxenberg’s testimony regarding the younger child’s statements to her to be credible and that “[t]hose statements provide good reason to be concerned that Danaipour may have masturbated in front of C.D. and, on one occasion, engaged her in that conduct. If that occurred it was, as another of McLarey’s experts, Dr. Carole Jenny, opined, a form of sexual abuse.” Id. at 322. However, the court also found that “a forensic evaluation is necessary to determine with a reasonable degree of reliability whether any form of sexual abuse has occurred and, if so, who the abuser was.” Id. The court further found that there was no evidence that the older child had been sexually abused in any way, and that neither child suffered from PTSD or would suffer from PTSD or any unusual psychological harm if returned with her mother to Sweden. Id. at 313. It concluded that “McLarey [has not] proven by clear and convincing evidence that the children’s return to Sweden on the conditions being ordered in ... this Memorandum will create the grave risk of psychological harm to them that would permit the court to deny Danaipour’s petition.” Id.

The court ordered McLarey to return the children to Sweden at her own expense by January 16, 2002, with a list of twelve conditions, including: that the children reside with McLarey unless ordered otherwise by a Swedish court; that a forensic evaluation be conducted in Sweden and that both parents participate fully in the evaluation; that a Swedish court decide the implications of the forensic evaluation for the custody of the children; that Danaipour have no contact with the younger daughter, unless ordered otherwise by a Swedish court; that Danaipour have only telephone contact three times a week with the older daughter unless the Swedish courts order otherwise; and that Danaipour request that a court of Sweden enter the terms of the order as a “mirror order” enforceable in Sweden. Id. at 327-28. In denying McLarey’s motion for additional time to seek leave to amend or to seek relief from the court’s order, the district court noted that “[t]his court expects that [the Swedish court] will enter a virtually verbatim Swedish counterpart of [the January 2, 2000,] Order.”

On January 3, 2002, Danaipour submitted a required mirror order motion to the Stockholm District Court. As the District of Massachusetts did not supply a translated order to the Swedish court, Danaipour filed his own translation, which McLarey challenged as inaccurate.

On January 9, 2002, the federal district court amended its order to extend the return date, if McLarey filed an appeal with this court by January 14, until this court could consider her motion for a stay pending appeal. On January 25, this court granted her motion for a stay, staying the order for return of the children pending the outcome of this appeal, and granted the parties an expedited appeal. Argument was heard on March 6, 2002. The last filing from the parties was received on March 29, 2002.

After oral argument, we permitted both parties to supplement the record to provide information regarding later events in Sweden. This supplemental information, as it turns out, tends to weaken further the district court’s assumptions that an equivalent sexual abuse evaluation would occur in Sweden. Even absent the post-hearing information, however, we would reverse the district court. The supplemental information simply confirms for us the errors in the district court’s analysis. On February 14, the Stockholm City Court entered an interim order in this case, which follows some, but not all, of the undertakings ordered by the district court. Danaipour v. McLarey, No. T 3165-00 (Stockholm City Court, Dept. 2, Div. 6, Feb. 14, 2002) (translation). The order stated that “the majority of the conditions imposed by Federal Court for a return of the children under the Hague Convention cannot for formal reason be confirmed.” Id. The City Court noted that in cases such as this one, it was bound to follow the Parental Code and could not lawfully issue an order containing provisions not specified by the relevant provisions of that Code. Specifically, it confirmed the orders that the children should continue to reside with McLarey; that the Child and Youth Psychiatric Service conduct an investigation to “clarify whether [the children] have been exposed to sexual molestation and in that case by whom” (specifying that the Service should report to it on this investigation by May 17, 2002); that the parents will participate in the investigation; and that the results of the investigation “shall be taken into account when the issue of future custody of the children is determined.” Id. 
It stated that it had no legal authority to confirm the portions of the order requiring McLarey to return the children to Sweden at her own cost, limiting Danaipour’s contact with the children, requiring McLarey to surrender her passport and not leave Sweden without court permission, and requiring that Danaipour not initiate proceedings against McLarey or attempt to enforce custody rights until the court decides otherwise. Id. The Swedish court did, however, revoke the December 14, 2001, order, which had granted Danaipour access to the children.

Pursuant to the Swedish court’s February 14 order, the case was referred to the Child and Youth Psychiatric Service (“BUP”). On March 2, 2002, the BUP informed the Swedish court that it “cannot accept this assignment, whereby we would investigate whether the above-mentioned children have been subjected to sexual abuse and by whom. According to our understanding, whether a crime has been committed and thus a criminal investigation should take place is a question for the police to investigate.”

Following this, Danaipour’s Swedish attorney informed us by affidavit that he contacted the BUP. On March 28, the BUP informed the Swedish attorney that “[w]e undertake ... to investigate whether the above-named children show signs of Post Traumatic Stress Disorder (PTSD) and also, if possible, to explain the underlying reasons. The investigation will be performed at an Open Care Child Psychiatric Clinic within our area of operations.” The letter provided no additional information on the parameters of the proposed evaluation. Further, the evaluation was not to be of whether there was sexual abuse, but rather of the different question of whether the children suffered from PTSD. The letter did not disclaim the statement in the BUP’s March 12 letter to the Swedish court that the BUP would not perform a sexual abuse evaluation.

Danaipour’s Swedish attorney has suggested, apparently as an alternative to a court supervised evaluation, a private evaluation to be done by a child psychiatrist and a Professor of Psychology at Uppsala University. A letter from them states that they have agreed to perform the evaluation at “Hasselby Nya Gard,” to begin at the earliest at the end of April. The letter states that the “questions to be dealt with” are “Have the two girls been traumatized, and what injuries, deficiencies have occurred?” Again, the proposed evaluation was not an evaluation of whether there had been sexual abuse, but rather of whether the children had been traumatized. No information is provided as to the expertise of the proposed evaluators or of the protocols to be followed. Nor is it a court supervised evaluation. There is no evidence that either letter was submitted to the Swedish court.

II.

We review the district court’s interpretation of the Hague Convention de novo. Whallon v. Lynn, 230 F.3d 450, 454 (1st Cir.2000). We review the district court’s factual findings for clear error, id., and review its application of the Convention to the facts de novo, Blondin v. Dubois, 238 F.3d 153, 158 (2d Cir.2001); Feder v. Evans-Feder, 63 F.3d 217, 222 n. 9 (3d Cir.1995).

Under the Hague Convention, children who have been wrongfully removed from their country of habitual residence must be returned, unless the abductor can prove one of the defenses allowed by the Convention. Hague Convention, arts. 12-13, T.I.A.S. No. 11,670, at 7-8; see also Von Kennel Gaudin v. Remis, 282 F.3d 1178, 1182 (9th Cir. 2002). All parties agree that Sweden is the country of habitual residence for purposes of the Hague Convention and that McLarey wrongfully removed the children from Sweden within the meaning of the Convention. See Hague Convention, art. 3, T.I.A.S. No. 11,670, at 4-5 (defining wrongful removal).

McLarey invoked three defenses before the district court: 1) “there is a grave risk that ... return would expose the child[ren] to physical or psychological harm or otherwise place the child[ren] in an intolerable situation,” id. art. 13(b), T.I.A.S. No. 11,670, at 8; 2) that return would be contrary to “fundamental principles of the requested State relating to the protection of human rights and fundamental freedoms,” id. art. 20, T.I.A.S. No. 11,670, at 9; and 3) that the children object to being returned and have “attained an age and degree of maturity at which it is appropriate to take account of [their] views,” id. art. 13, T.I.A.S. No. 11,670, at 8. The district court found that McLarey had failed to meet her burden on all three defenses; McLarey did not pursue the latter two defenses before this court. It is the Article 13(b) grave risk defense on which we decide the appeal.

Under the United States’s legislation implementing the Hague Convention, a party opposing return based on an Article 13(b) exception bears the burden of establishing that exception by clear and convincing evidence. 42 U.S.C. § 11603(e)(2)(A) (1994). The district court held that subsidiary facts must be proved by a preponderance of the evidence, a standard we accept. Danaipour, 183 F.Supp.2d at 314.

The Convention establishes a strong presumption favoring return of a wrongfully removed child. Whallon, 230 F.3d at 460; see also Turner v. Frowein, 253 Conn. 312, 752 A.2d 955, 970 (Conn.2000) (“[T]he Hague Convention generally favors repatriation as a means of restoring the preabduction status quo and of deterring parents from crossing international boundaries in search of a more sympathetic forum”). Exceptions to the general rule of expedient return, including Article 13(b), are to be construed narrowly. See Permanent Bureau, Hague Conference on Private Int’l Law, Conclusions and Recommendations of the Fourth Meeting of the Special Commission to Review the Operation of the Hague Convention of 25 October 1980 on the Civil Aspects of International Child Abduction § 4.3 (2001), available at http://www.hcch.net/e/conventions/reports28e.html; E. Pérez-Vera, Explanatory Report, ¶ 34, at 434 in 3 Hague Conference on Private Int’l Law, Acts and Documents of the Fourteenth Session (1982) (translation of the Permanent Bureau), available at http://www.hcch.net/e/conventions/menu28e.html [hereinafter Pérez-Vera Report].

The Article 13(b) defense may not be used “as a vehicle to litigate (or relitigate) the child’s best interests.” Hague International Child Abduction Convention: Text and Legal Analysis, 51 Fed.Reg. 10,494, 10,510 (Dep’t of State Mar. 26, 1986); see also Whallon, 230 F.3d at 460. Under Article 13(b), “grave” means a more than serious risk. See Hague International Child Abduction Convention: Text and Legal Analysis, 51 Fed.Reg. at 10,510. And even if the conditions for an Article 13(b) exception are met, the Hague Convention gives the court discretion to return the child to the country of habitual residence. Hague Convention, arts. 13, 18, T.I.A.S. No. 11,670, at 8-9; Walsh v. Walsh, 221 F.3d 204, 221 n. 17 (1st Cir.2000); Friedrich v. Friedrich, 78 F.3d 1060, 1067 (6th Cir.1996); Hague International Child Abduction Convention: Text and Legal Analysis, 51 Fed.Reg. at 10,509.

The district court here concluded that it required a full independent sexual abuse evaluation in order to make a finding on whether sexual abuse had occurred, and thus whether grave risk of harm would preclude return. Danaipour, 183 F.Supp.2d at 317. It declined to order such an evaluation, however, believing that the determination could be made in Sweden without putting the children at grave risk, so long as certain conditions were met. Id. at 313, 323, 327-28. The district court’s decision that it could defer the ultimate issue of whether sexual abuse had occurred is best evidenced by its statement at the November 1, 2001, scheduling conference:

I’m not necessarily being asked in this case to decide whether the allegations of child abuse are proven by clear and convincing evidence, but I’m being asked to decide whether there are feasible conditions under which the children can be returned to Sweden so the Swedish authorities can decide those issues.

The district court concluded the evaluation could be done as well in Sweden as here. Danaipour, 183 F.Supp.2d at 313, 327. Implicit in this conclusion is a determination that, even if the children had been sexually abused, they could be returned, and the onus would fall upon the Swedish authorities to protect them. Without deciding that there could never be a situation in which a district court could properly decline to make a finding on sexual abuse allegations or defer such a finding to the courts of the country of habitual residence, we hold on the facts and applicable law here that the district court violated the terms of the Hague Convention.

First, we start with the context in which the grave risk analysis must take place. Of great significance to us is the policy of this country in enforcing the Hague Convention with regard to the type of risk alleged: sexual abuse of a young child. The policy, as articulated by the Department of State, is to view sexual abuse as an intolerable situation. Hague International Child Abduction Convention: Text and Legal Analysis, 51 Fed.Reg. at 10,494, 10,510. The district court failed to consider this sufficiently when making its determinations regarding the grave risk analysis and the use of undertakings. Certain statements by the court about what type of conduct and what type of psychological evidence were needed to show grave risk arising out of sexual abuse are inconsistent with United States policy.

Second, the Convention assigns the task of making the “grave risk” determination to the court of the receiving country; here, this task includes the obligation to make any subsidiary factual findings needed to determine the nature and extent of any risk asserted as a defense to returning the child. The treaty does not give the courts of the country of habitual residence jurisdiction to answer the grave risk question; their jurisdiction is determined by the law of their own country. The district court’s implicit determination that, in the circumstances of this case, the children could be returned without first determining whether they had been sexually abused was inconsistent with United States policy with regard to the Hague Convention, which holds that sexual abuse by a parent constitutes an intolerable situation and subjects the child to grave risk. Given the significant evidence of sexual abuse presented here, we believe that it is only after the district court has resolved the sexual abuse issue that the court will be in a position to proceed intelligently down the next avenue of inquiry — whether the children can be returned safely to the country of habitual residence. That is not to say that there may never be a case where it may be proper to defer to the courts of the country of habitual residence a finding on a key factual issue underlying a grave risk determination, but this is not that case: the evidence of sexual abuse so far presented is too serious, and, if the children have in fact been sexually abused, the problem of safeguarding them once they are returned is too great. We hold that the district court erred in determining that the Convention did not require it to determine the issue of sexual abuse.

Third, it has been made clear by later events that the district court erred in deciding that a forensic sexual abuse evaluation could and would be done under the supervision of the Swedish courts. We discuss this more fully in the next section. Nonetheless, and independently of the events in Sweden, we disapprove of the district court’s analytical methodology for two reasons. First, what was at issue was not simply whether the Swedish procedures for conducting a forensic evaluation were adequate, but also the effect of the return on undermining the validity of a sexual abuse evaluation of the children. The district court focused primarily on the first question. The question under the Convention is the effect of return on these particular children, and therefore the focus properly also should have been on whether a forensic evaluation would be viable given these children’s circumstances.

Next, a sexual abuse evaluation in these circumstances ordinarily would be done under the supervision of the court making the grave risk analysis, here the United States court. If, as the father here sought, there is a request that the evaluation be performed in the country of habitual residence, we think that is in the nature of an undertaking. As such, the proponent of the undertaking bore the burden of showing that an equivalent evaluation could be done as well in Sweden. The district court appears, however, to have required McLarey to show that there was a grave risk that an evaluation could not be done in Sweden; that was error in the allocation of evidentiary burdens. We need not decide whether the father met the burden, in light of the developments. We do caution district courts that they must be careful not to prejudice the process of proving grave risk.

Fourth, the district court’s use of conditions went beyond its authority by essentially imposing requirements on a foreign court. That was error. In addition, it also made incorrect assumptions that its own order could and would be enforced by a foreign court.

A. Sexual Abuse as a Grave Risk of Harm

The Article 13(b) exceptions are narrow, and should be construed narrowly by the courts. In this instance, however, some of the district court’s statements evidence an overly restrictive approach to the type of conduct that constitutes sexual abuse, and to the relationship between sexual abuse of a child and grave risk. The policy under the Convention of both the United States government and the Commonwealth of Massachusetts is weighted towards protection of the child when there is credible evidence of sexual abuse, particularly when the child is so young and when the allegations involve abuse by a parent. This policy informs the grave risk analysis.

The United States Department of State’s guidelines on the Hague Convention state that sexual abuse by a parent is an example of an Article 13(b) defense justifying non-return. Hague International Child Abduction Convention: Text and Legal Analysis, 51 Fed.Reg. at 10,510. In fact, it is the only example so provided by the Department. The Department of State’s legal analysis states that:

An example of an “intolerable situation” is one in which a custodial parent sexually abuses the child. If the other parent removes or retains the child to safeguard it against further victimization, and the abusive parent then petitions for the child’s return under the Convention, the court may deny the petition. Such action would protect the child from being returned to an “intolerable situation” and subjected to a grave risk of psychological harm.

Id. American policy thus equates sexual abuse with both prongs of the Article 13(b) defense to return of the child: intolerable situation and grave risk. The Department of State’s interpretation of the Convention is entitled to great weight. See Blondin, 238 F.3d at 162 n. 10.

Although its ruling on this point is not entirely clear, the district court seemingly placed too much emphasis on physical assault as an element of sexual abuse, finding that “[t]he credible evidence does not prove that C.D. has been sexually abused physically,” Danaipour, 183 F.Supp.2d at 321 (emphasis added), and that “[a]lthough returning a child who had been raped to the parent who molested her could reasonably be regarded per se as an intolerable situation, this is not such a case,” id. at 325-26.

Sexual abuse other than rape may create an intolerable situation or a grave risk under Article 13(b), particularly when such abuse occurs at the hand of a parent. The Commonwealth of Massachusetts, acting as amicus curiae, has urged us, in light of the district court’s language, to clarify that penetration is simply not a prerequisite to a finding of sexual abuse posing a grave risk of harm to a child. We agree. The Commonwealth states that this is particularly true when the abuser is a parent, stating that “[e]xperts recognize that sexual abuse committed by a parent and unaccompanied by penetration often results in significant trauma,” given the violation of trust inherent in parental sexual abuse. The Commonwealth also notes that its own policy is to prosecute the act of forcing a child to sexually touch an adult as criminal sexual assault. The district court apparently applied a different standard to the significance of this type of abuse; for instance, the court stated in its opinion that the younger child “may have touched [her father’s] penis on one ... occasion.” Id. at 317. The child’s statement to Dr. Luxenberg, however, was that she “squeezed” her father’s penis “and it was very hard,” surely a far more serious occurrence. The district court found Dr. Luxenberg’s testimony credible. Id. at 322. Moreover, there was no evidence that this only happened on one occasion, as the district court said, id. at 317, nor was there evidence that it had happened more than once.

The district court’s language concerning “rape” may also reflect an improper standard. It would be inappropriate to apply any standard that vaginal penetration, but not other types of sexual abuse, would automatically qualify as a grave risk or an intolerable situation. The proper focus is on the effect on the child and whether there is “grave risk of physical or psychological harm or otherwise ... intolerable situation” to which the child would be exposed upon return. This conclusion is supported by the Department of State’s guidance, which refers to sexual abuse, not limited to rape or forcible intercourse, in discussing “grave risk.” See Hague International Child Abduction Convention: Text and Legal Analysis, 51 Fed.Reg. at 10,510.

In addition, the district court’s opinion placed a great deal of emphasis on its finding that neither child suffered from PTSD, Danaipour, 183 F.Supp.2d at 313, 321-22, 325, which was a major subject of expert testimony at the trial. Although a finding that a child suffers from PTSD and would deteriorate if returned to the country of habitual residence could be evidence tending to support a finding of grave risk under Article 13(b), see, e.g., Blondin, 238 F.3d at 163, a risk of harm arising out of the return to a locale where abuse occurred is a factor that a district court may properly consider in its overall grave risk analysis regardless of the label it bears.

A finding that a child is currently not experiencing severe psychological effects of sexual abuse is not necessarily dispositive; for example, there was significant testimony that sexually abused children often function well as small children, only to experience significant psychological deterioration as they reach puberty. The diagnostic criteria for PTSD, after all, were not developed to reflect children’s psyches. Dr. van der Kolk, who served on the committee to define the diagnostic criteria for PTSD for the most recent edition of the leading psychiatric diagnostic manual (the “DSM-IV”), felt it necessary to “warn the court that [a diagnosis of PTSD] is not a be all and end all in determining whether [the younger child] was seriously harmed by the life circumstances.” He also testified that return to Sweden would be harmful for both girls, although he only diagnosed the younger child with PTSD. This type of evidence has a direct bearing on grave risk determinations in cases where sexual abuse is alleged.

B. Referral of Sexual Abuse/Grave Risk Determination to Swedish Courts

McLarey argues the district court “punted” on its Hague Convention obligations by not deciding the issue of sexual abuse, the basis for her grave risk claim. McLarey also says that the only reason that the children were not properly evaluated in Sweden prior to their wrongful removal was that Danaipour did not give his permission for or cooperate in any such evaluation. Danaipour denies that he stymied the Swedish investigations, but the record supports McLarey’s argument on this point.

Under the text of the Convention, the question for a U.S. court confronted with an Article 13(b) defense is whether “there is a grave risk that the [child’s] return would expose the child to physical or psychological harm or otherwise place the child in an intolerable situation.” Hague Convention, art. 13(b), T.I.A.S. No. 11,670, at 8. It is clear that “a court in the abducted-to nation has jurisdiction to decide the merits of an abduction claim, but not the merits of the underlying custody dispute.” Friedrich, 78 F.3d at 1063; see also 42 U.S.C. § 11601(b)(4) (1994); Hague Convention, art. 19, T.I.A.S. No. 11,670, at 9. The Convention assigns the duty of the grave risk determination to the country to which the child has been removed. It is not a derogation of the authority of the habitual residence country for the receiving U.S. courts to adjudicate the grave risk question. Rather, it is their obligation to do so under the Convention and its enabling legislation. Generally speaking, where a party makes a substantial allegation that, if true, would justify application of the Article 13(b) exception, the court should make the necessary predicate findings. Cf. Whallon, 230 F.3d at 460 (1st Cir.2000) (upholding district court’s findings that father had not verbally abused daughter and that any psychological harm resulting from abuse of mother in that case did not rise to level required by 13(b)).

Implicit in the district court’s decision is a determination that, even if the evaluation requested by McLarey led to a finding that sexual abuse had occurred, McLarey would not be able to meet her burden of showing grave risk upon return. The court found that “[i]n these circumstances, McLarey has not proven by clear and convincing evidence that [either of the children] will be exposed to a grave risk of physical or psychological harm, or otherwise be placed in an intolerable situation, if returned on the conditions the court is ordering.” Danaipour, 183 F.Supp.2d at 325 (emphasis added). The district court did not make a decision on whether Danaipour had sexually abused the children, or take the steps to obtain the evidence it thought necessary to make a reliable finding.

We think there are several errors in the district court’s approach. It is one thing to evaluate whether to return a child once the grave risk occasioned by sexual abuse has been shown. It is another to say, as the district court did, that the child could be returned before it knew whether there was sexual abuse, despite credible evidence that there had been sexual abuse. Secondly, even on its own terms, the court order is based on improper assumptions. As discussed below, the imposition of many of those conditions was erroneous, as was the court’s finding that the Swedish courts would undertake a forensic evaluation. It was based on these errors that the court declined to order the forensic evaluation that it found “would be necessary to determine in a medically reliable manner whether either child was sexually abused in any way.” Id. at 317. The court declined to gather the very information that it found was necessary to make a determination on the key issue. In this case, the trial judge should have made a determination on the underlying question, whether sexual abuse occurred.

The district court’s approach here cuts the inquiry short, in a way that is inconsistent with Hague Convention obligations and United States policy on the Convention, as expressed in the Department of State analysis of grave risk. Hague International Child Abduction Convention: Text and Legal Analysis, 51 Fed.Reg. at 10,510. The trial judge should have taken the steps available to him to determine if sexual abuse occurred; only once he had made such a finding could he ask the right questions about whether the children could be returned to the locale of the abuse, where the abuser still resided and where the district court could not guarantee the outcome of future determinations regarding the safety of the children. Similarly, if the evaluation exonerated Danaipour, or even if it was inconclusive, that would also be relevant information to deciding the level of risk, if any, that the girls would face if returned.

C. Location of Forensic Evaluation

The district court accepted the need for further investigation into whether sexual abuse occurred. Danaipour, 183 F.Supp.2d at 313. The GAL also found that “an evaluation is necessary ... for the protection of the children.” The district court then concluded that such an evaluation could be properly done in Sweden. Id. at 327. We now know the district court was wrong in concluding that a forensic sexual abuse evaluation would be done in Sweden, as discussed in the next section.

We are also concerned about how the district court approached the problem. The focus of the district court’s inquiry was on the adequacy of the Swedish procedures for conducting forensic sexual abuse evaluations. If these procedures had not been adequate, that, of course, would be significant. But even if the procedures were adequate, there still remained the highly relevant question of whether the effect of the return on the children would nonetheless undermine the validity of any examination by making it more likely that the children would not talk to those charged with determining whether or not abuse had occurred.

We give an example from the testimony. In a written report submitted to the court, Dr. Jenny had concluded that the children “should be allowed to stay in the United States for further evaluation. Returning the children to Sweden ... would significantly decrease the likelihood of an effective evaluation of the girls, and would place their safety at risk.” In an affidavit, Dr. Jenny stated:

Given this delay [of four to six months before an evaluation could be performed], given the interruption of what appears to be a trusting therapeutic relationship, and given that during such process [the children] may be returned to both the geographic and physical source of prior trauma, there is a significant likelihood that a dependable, accurate sexual abuse evaluation will not occur and any investigative and/or therapeutic benefits to these children will be jeopardized. This would be true even if any access to their father upon their return were supervised.

At trial, Dr. Jenny was asked her opinion as to whether an evaluation in Sweden could be viable. The court, however, sustained Danaipour’s objection, holding that it lacked foundation as to Dr. Jenny’s “knowledge of circumstances in Sweden, among other things.” McLarey’s counsel attempted to convince the court that the testimony did not go to the situation in Sweden, but rather to the children’s likelihood of disclosure if returned to Sweden. The court then asked whether, given the proposed undertakings, a valid evaluation could be done in Sweden. Dr. Jenny responded that return to Sweden “would set the children back and that they would be given the message that disclosing ... leads to chaos. And ... just being in the environment where abuse may have occurred could certainly affect their perceptions and their ability to communicate what happened to them.”

In contrast, Danaipour’s expert, Dr. Munson, testified only that evaluations are typically done in the child’s home country. He testified that there are some benefits to doing an evaluation there, including “familiarity with the system, familiarity with all of the factors in the locale, the area, what kind of services are available, and even in terms of the culture of the person that has been abused, as well as the culture of the abuser.” However, this testimony goes only to what is generally done, not to what is a reasonable option for these particular children. Moreover, he did not specifically respond to the numerous expert witnesses who had testified that these children would be extremely unlikely to discuss any abuse if they were returned to Sweden. Thus, plaintiff’s experts were largely unrebutted on this very material point.

Next, the district court improperly allocated the burdens. The father’s argument that the evaluation be done in Sweden was essentially an undertakings proffer, as to which he bore the burden. We do not know what the district court would have concluded had it properly allocated the evidentiary burdens. In the present posture of the case, we need not decide whether the father met his burden. We further discuss undertakings in the next section.

D. Undertakings

The district court’s findings that the children could be safely returned and that a valid forensic evaluation could be conducted in Sweden relied heavily on its assumption that it could impose enforceable “undertakings,” most notably a requirement that the father not have any contact with his younger daughter unless otherwise ordered by a Swedish court and a requirement that a proper forensic evaluation be conducted in Sweden. Danaipour, 183 F.Supp.2d at 327. The court stated that “McLarey [has not] proven by clear and convincing evidence that the children’s return to Sweden on the conditions being ordered ... will create the grave risk of psychological harm to them that would permit the court to deny Danaipour’s petition,” id. at 313 (emphasis added), and, later, that “[the children] can, on certain conditions, be returned to Sweden without being exposed to a grave risk of physical or psychological harm, or any other intolerable situation,” id. at 327 (emphasis added).

The concept of “undertakings” is based neither in the Convention nor in the implementing legislation of any nation. See P.R. Beaumont & P.E. McEleavy, The Hague Convention on International Child Abduction 156-59 & n. 183 (1999). Rather, it is a judicial construct, developed in the context of British family law. Id.

This court has previously described the utility of undertakings in Hague Convention Article 13(b) cases:

A potential grave risk of harm can, at times, be mitigated sufficiently by the acceptance of undertakings and sufficient guarantees of performance of those undertakings. Necessarily, the “grave risk” exception considers, inter alia, where and how a child is to be returned. The undertakings approach allows courts to conduct an evaluation of the placement options and legal safeguards in the country of habitual residence to preserve the child’s safety while the courts of that country have the opportunity to determine custody of the children within the physical boundaries of their jurisdiction.

Walsh, 221 F.3d at 219 (footnote omitted); see also Feder, 63 F.3d at 226 (noting that courts sometimes use undertakings to ensure that the child does not suffer from “short-term harm” if returned). Undertakings can be an important tool for courts to comply with the Convention’s strong presumption of a safe and speedy return of the wrongfully removed child.

At the same time, there are some limits to a court’s ability to use undertakings to avoid an Article 13(b) defense. The court entertaining the petition must recognize the limits on its authority and must focus on the particular situation of the child in question in order to determine if the undertakings will suffice to protect the child.

The Department of State’s view of undertakings, to which we accord great weight, is that they should be limited in scope. The Department’s view of undertakings is expressed in a letter to the British government, written in response to British concerns about American courts’ failure to enforce consistently British undertakings:

[U]ndertakings should be limited in scope and further the Convention’s goal of ensuring the prompt return of the child to the jurisdiction of habitual residence, so that the jurisdiction can resolve the custody dispute. Undertakings that do more than this would appear questionable under the Convention, particularly when they address in great detail issues of custody, visitation, and maintenance.

Letter from Catherine W. Brown, Assistant Legal Adviser for Consular Affairs, United States Dep’t of State, to Michael Nicholls, Lord Chancellor’s Dep’t, Child Abduction Unit, United Kingdom (Aug. 10, 1995), at http://www.hiltonhouse.com/articles/Undertaking_Rpt.txt [hereinafter Department of State Comment on Undertakings].

There are two basic interrelated issues: 1) international comity; and 2) the appropriateness of undertakings when the abducting parent claims to be protecting the child from abuse.

1. International Comity

When considering possible undertakings, American courts must be sensitive to the need for comity under the Convention between the courts of different nations. The U.S. Department of State, in a legal memorandum attached to the above-cited letter, explained:

Undertakings would appear most consistent with the Convention when designed primarily to restore the status quo ante, or when they impose reciprocal obligations on both the left-behind and the taking parent.... The approach taken by [some] courts, whereby undertakings are reasonably tailored to expedite the return of the child, impose reciprocal obligations on both parents, and explicitly terminate upon action by the court of appropriate jurisdiction, seems entirely appropriate.

Id., attached Legal Memorandum. In the letter, the Department of State listed examples of appropriate undertakings: an agreement that the abducting parents return to the country of habitual residence with the child; assignment of costs for the return flight; and interim custody until a court in the country of habitual residence can arrive at a decision. Department of State Comment on Undertakings, supra. As an alternative to undertakings, the Department of State has suggested “safe harbor” orders, entered by a court in the country of habitual residence at the behest of the left-behind parent, prior to the entry of the return order. Id. Such an approach would avoid the unseemliness of a U.S. court issuing orders for a foreign court to enforce, and the foreign court’s possible noncompliance, both of which occurred here.

There is controversy in the international community as to the use of undertakings. The Special Commission studying the operation of the Hague Convention had this to say with regard to undertakings and safe harbor orders:

On the one hand, “undertakings” are seen as mere proposals agreed upon by the parties and submitted to the requested judge. They are limited in scope to the protection of the child for a limited time and allow the child to be returned sooner, and should therefore be enforced by requesting States as valid under the Convention on the basis of comity. On the other hand, ... undertakings are used too broadly and allow abducting parents to gain significant advantages from the abduction. Furthermore, if such undertakings are mere agreements between the parties, they can be entered into before a judge in the requesting State and thus be incorporated into a “safe harbour” order, which is more readily enforceable. According to some, undertakings incorporated in the return order cannot be enforced as such in the country of habitual residence, short of additional proceedings normally required to recognise foreign judgments.

Permanent Bureau, Hague Conference on Private Int’l Law, Report of the Third Special Commission Meeting to Review the Operation of the Hague Convention on the Civil Aspects of International Child Abduction ¶ 64 (1997), available at http://www.hcch.net/e/conventions/reports28e.html. This confirms that undertakings should be limited, and are not themselves binding on foreign courts.

Conditioning a return order on a foreign court’s entry of an order, as the district court did here, raises serious comity concerns. The Department of State has stated that it “does not support conditioning the issuance of a return order on the acquisition of [an] order from a court in the requesting state,” presumably because such a practice would smack of coercion of the foreign court. State Department Comment on Undertakings, supra, attached Legal Memorandum; see also Pérez-Vera Report, supra, ¶ 120 (“[T]he return of the child cannot be made conditional upon [a] decision or other determination being provided [by the court of the country of habitual residence].”)

There are also serious concerns about whether undertakings or safe harbor orders that go beyond the conditions of return are enforceable in the home country. For instance, at least one Massachusetts state court has declined to enforce undertakings entered by a foreign court. See, e.g., Roberts v. Roberts, No. 95-12029-RGS, 1998 U.S. Dist. LEXIS 4089 (D.Mass. Feb. 27, 1998) (discussing Massachusetts state court’s refusal to enforce undertakings entered by court in the United Kingdom), adopted by 1998 WL 151773 (D.Mass. Mar. 17, 1998). Moreover, their utility in countries where courts lack contempt or injunctive power is somewhat questionable. See Beaumont & McEleavy, supra, at 166-70. Cf. Blondin, 238 F.3d at 160 (court heard testimony from French law expert that French court would enforce undertakings if they were not contrary to public policy); Panazatou v. Pantazatos, No. 960713571S, 1997 WL 614519, *3 (Conn.Super.Ct. Sept.24, 1997) (court arranged a conference call to a Greek judge to discuss whether undertakings would be honored in Greece).

In this case, there was evidence presented that not all of the proposed undertakings would be enforceable in Sweden, nor would a mirror order suggested by a United States court necessarily be entered by a Swedish court, or be enforceable even if so entered. McLarey’s Swedish legal expert testified that

As a rule, we do not [implement orders entered in foreign jurisdictions]. The Swedish court needs its own basis, its own evidence, its own decision.... And I think in the last court order in this case, the court clearly stated that it will wait for what happened when the children come back to Sweden, and then they will rule again.
... [I]f both parties consent to do [an evaluation], there’s a high probability, but no guarantee, that the court will order such an evaluation.
... [But y]ou can always go back to the court and ask for something else, that you do not consent anymore, and you think its detrimental to the child.... [A]n interim decision can always be turned over.... There is no limit to interim positions in Sweden. And you can also appeal each interim decision by the court.

The expert further opined that, if a parent withdrew from an evaluation, the court would not then have the power to order the parent to participate. McLarey also submitted a report by the U.S. Department of State, indicating that Swedish courts do not have authority to issue contempt orders for violations of visitation orders. See Dep’t of State, Report on Compliance with the Hague Convention on the Civil Aspects of International Child Abduction (2001), at http://travel.state.gov/2000_Hague_Compliance_Report.html.

The district court noted that the parties “have agreed to request that the Swedish court enter a ‘mirror order’ imposing any conditions ordered by this court. This court concludes that a Swedish court would do so.” Danaipour, 183 F.Supp.2d at 321. However, the district court’s order is belied both by the expert testimony, described above, and by subsequent events. The day after the United States district court issued its order, Danaipour submitted a motion to the Stockholm District Court stating that “[t]he Swedish court is not acquainted with the concept ‘mirror order.’ Decisions in foreign courts can be directly carried out in Sweden if they concern custody issues and are put forth in the Nordic countries or in [certain other European] countries.” The Swedish court itself, in its February 14 order, stated that it did not have such authority. Danaipour v. McLarey, No. T 3165-00 (Stockholm City Court, Dept. 2, Div. 6, Feb. 14, 2002) (translation).

Indeed, it now appears that the Swedish court lacked the authority to order a full forensic sexual abuse evaluation conducted in keeping with the established protocols for such evaluations, which was a key component of the district court’s order, and there appears to be no mechanism for making such an evaluation a reality. The Swedish agency that the Swedish court charged with conducting the evaluation first stated that it “cannot accept this assignment,” and suggested that the matter should be referred to the Swedish police. The agency has now agreed to conduct an evaluation into whether the children have PTSD, which is different from the evaluation for sexual abuse according to established protocols contemplated by the United States district court’s order. The agency has not disavowed its position that it cannot do a sexual abuse evaluation.

As Danaipour’s expert emphasized in his initial report and in his testimony, there are established protocols within the psychiatric community for investigating whether sexual abuse has occurred. The Child and Youth Psychiatric Service’s proposed evaluation does not directly address the question of sexual abuse — the key issue for our purposes — nor would it follow the protocols devised for investigating that question. It is open to debate, moreover, whether the categorization of PTSD is useful in a child less than four years old, and there was much testimony at trial that children who have been sexually abused often do not exhibit signs of trauma until they are older. Therefore, it appears that there is now little, if any, chance that the district court’s order that a forensic sexual abuse evaluation be done in Sweden will be carried out.

In sum, the district court offended notions of international comity under the Convention by issuing orders with the expectation that the Swedish courts would simply copy and enforce them. The district court had no authority to order a forensic evaluation done in Sweden, or to order the Swedish courts to adjudicate the implications of the evaluation for the custody dispute. See Beaumont & McEleavy, supra, at 161 (“[W]hile a court may find it relatively easy to extract undertakings from an applicant there can be no guarantee that such orders will be enforced in the State of the child’s habitual residence.”). Moreover, its assumption that Swedish courts would enforce the undertakings was both legally and factually erroneous. These undertakings, which the district court believed necessary to protect the children from grave risk, were invalid, and therefore the return order cannot stand for these reasons as well.

2. Undertakings in Context of Abuse Allegations

There is also authority indicating that undertakings should be used more sparingly when there is evidence that the abducting parent is attempting to protect the child from abuse. The Department of State has indicated that:

If the requested state court is presented with unequivocal evidence that return would cause the child a “grave risk” of physical or psychological harm, however, then it would seem less appropriate for the court to enter extensive undertakings than to deny the return request. The development of extensive undertakings in such a context could embroil the court in the merits of the underlying custody issues and would tend to dilute the force of the Article 13(b) exception.

Department of State Comment on Undertakings, supra, attached Legal Memorandum. The Department of State’s guidance on the Convention also supports the conclusion that a court need not consider extensive undertakings when dealing with an Article 13(b) defense based on sexual abuse; the Department says that “[i]f the other parent removes or retains the child to safeguard it against further victimization ... the court may deny the petition.” Hague International Child Abduction Convention: Text and Legal Analysis, 51 Fed. Reg. at 10,510. This analysis implies that the court may deny the return petition on that basis alone, and is not necessarily required to consider ameliorative undertakings. As the Department of State comment on undertakings notes, undertakings are most effective when the goal is to preserve the status quo of the parties prior to the wrongful removal. This, of course, is not the goal in cases where there is evidence that the status quo was abusive.

Leading commentators on the Convention also agree that undertakings should be applied cautiously in these cases:

[T]he imposition of undertakings, albeit rare, does not rest easily with assertions made in relation to Article 13(1)(b)....
Therefore it is submitted that if one of the Article 12 or 13 exceptions is applicable the court should not exercise its discretion to return the child unless enforcement of the undertakings can be guaranteed.

See Beaumont & McEleavy, supra, at 162, 165.

Under the Convention and its implementing legislation, the American courts have a duty to ensure that a child is not returned to a situation of grave risk or an intolerable situation. See Pérez-Vera Report, supra, ¶ 29 (“[T]he interest of the child in not being removed from its place of habitual residence ... gives way before the primary interest of any person in not being exposed to physical or psychological danger or being placed in an intolerable situation.”). Where substantial allegations are made and a credible threat exists, a court should be particularly wary about using potentially unenforceable undertakings to try to protect the child. Undertakings that will protect the child from grave risk for only a very limited time are insufficient to defeat an Article 13(b) claim. See Walsh, 221 F.3d at 218 (“The Convention does not require that the risk be ‘immediate’; only that it be grave.”).

The determination of whether any valid undertakings can be crafted in such a situation is inherently fact-bound. See, e.g., Turner v. Frowein, 253 Conn. 312, 752 A.2d 955 (Conn.2000) (remanding for further consideration of alternative care arrangements and legal safeguards for repatriation of child, where evidence established that father sexually abused child and physically abused mother, home country authorities had failed to respond to mother’s complaints, and home country had no mechanism for a no contact order); Walsh, 221 F.3d 204 (holding district court’s order with undertakings would not sufficiently protect child from violent father who abused mother and regularly ignored court orders). However, the terms of the Convention, as well as the Department of State’s guidance, indicate that the protection of the child must remain paramount.

III.

Conclusion

The district court ultimately did not decide whether the father had sexually abused his younger daughter. Neither do we. Danaipour may be innocent of these accusations.

Accusations that a parent has sexually abused a young child in private are difficult to prove. They are also difficult to disprove. And claims of abuse, whether brought in good faith or for other reasons, are sometimes used as weapons in divorce and custody battles. The trial courts must make nuanced judgments. Still, the evidence here raises, at least, a clear and substantial claim: the treating child psychologist was found to be credible in her recounting of the child’s activities and statements, those activities and statements are evidence of sexual abuse by the father, and the psychologist found no evidence of coaching of the child. Further, there is considerable supporting expert testimony.

On these facts, the district court must adjudicate the issue of whether sexual abuse occurred, ordering further evaluations if necessary in order to determine whether the children are at a grave risk of physical or psychological harm, or of otherwise being placed in an intolerable situation if returned.

The district court decision ordering return is reversed, and the case is remanded for proceedings consistent with this opinion. 
      
      . The Massachusetts Citizens for Children, et al., state that in one study of children who had been sexually abused, only 43% initially made a verbal disclosure of sexual abuse, even to trained sexual abuse investigators.
     
      
      . According to an expert report submitted by McLarey, 75-85% of children who have been sexually abused have normal physical examinations. Amici Massachusetts Citizens for Children, et al., also presented studies indicating that, in most cases of sexual abuse, there is no medical evidence.
     
      
      . McLarey submitted expert testimony concerning the Swedish procedures for investigating allegations of child sexual abuse. One expert report, from a Swedish police officer, stated that it is typical in Sweden for the social services agency to perform a general investigation into the welfare of the child and refer any specific criminal allegations to the police, as was done here. If the police investigation is closed without criminal charges, the social services "ordinarily will not remove the child from a parent's custody, and will often close their own investigations.” The expert further stated that police investigations frequently consist of only one interview of the child in a room at the police station, conducted by an officer with no medical or psychiatric degrees, despite the fact that, in her experience, "children will not make disclosures in this setting, especially at a first meeting.”
      Another report, submitted by a Swedish lawyer specializing in family law with an emphasis on child sexual abuse cases, concurred that the police investigation in this case, consisting of a single short interview of the children conducted by police officers with little training in this area, was typical. The legal expert also stated that social services' investigation "is not geared toward determining whether sexual abuse has actually occurred,” and that social services often closes its investigation once the police investigation is closed. Moreover, the legal expert stated that "[i]t is not uncommon for a [Swedish] court to refuse to grant a request to issue an order that a sexual abuse investigation occur [without the consent of one parent]; typically, such investigations are seen as within the province of police authorities.” Danaipour did not submit any evidence contrary to this.
     
      
      . McLarey argues that the Swedish authority's report is not conclusive on the issue of sexual abuse. Prior to the trial in the United States district court, McLarey submitted expert reports by Dr. Bessel van der Kolk and Dr. Carole Jenny indicating that it is not unusual for young children who have been sexually abused to continue to function normally in school and day care. Dr. Jenny also testified at trial that some children attempt to compensate for the abuse by being extremely well behaved, and that there is no research to indicate that abused children react negatively to an abusive parent. McLarey also submitted several expert reports suggesting that an investigation’s failure to disclose sexual abuse does not necessarily mean that abuse did not occur. One report, prepared by Drs. Glenn Saxe and Wanda Grant Knight, experts in childhood trauma, stated that "it is unusual for traumatized children to disclose abuse to unfamiliar adults.” Danaipour’s expert, Dr. Carlton Munson, testified, however, that children who have been sexually abused typically engage in sexualized behavior, have interpersonal problems, fight with other children, become socially withdrawn, and exhibit developmental regression.
     
      
      . Federal Rule of Evidence 706(a) provides, in relevant part, that:
      The court may on its own motion or on the motion of any party enter an order to show cause why expert witnesses should not be appointed, and may request the parties to submit nominations. The court may appoint any expert witnesses agreed upon by the parties, and may appoint expert witnesses of its own selection.
     
      
      . McLarey submitted the motion two weeks before the end of discovery. Although it would have been helpful if McLarey had made the motion earlier, it appears McLarey believed that, given the disclosures to the treating therapist, she already had sufficient evidence to meet her burden. The December 3 motion for a court-ordered evaluation appears, in part, to have been a response to Danaipour's November 21 initial expert report from Dr. Munson, who expressed concern that Dr. Luxenberg was a treating doctor, not an independent evaluator, and that Dr. Luxenberg had not, in his view, followed established protocols for investigating sexual abuse allegations.
     
      
      . From the translation provided, it is not clear whether this is the child’s right to exercise (or decline to exercise), or whether it is the father’s right.
     
      
      . Upon questioning by the court, Dr. Luxenberg testified that the older girl suffers from PTSD, as well. Danaipour’s counsel objected to this question and it appears that the court sustained the objection.
     
      
      . The court also stated that "sexual abuse has not been proven,” Danaipour, 183 F.Supp.2d at 325, which seems to be in tension with its finding that Dr. Luxenberg’s testimony was credible.
     
      
      . The district court noted that "[t]he parties have implicitly agreed that [the children] should not be separated.” Danaipour, 183 F.Supp.2d at 324 n. 8.
     
      
      . The affidavit from Swedish counsel mischaracterizes the contents of the attached communications from both the Child and Youth Psychiatric Services and from the University. Further, the affidavit asserts that the "court ordered evaluation will be conducted in Sweden either by the BUP (as originally requested by Ms. McLarey) or by the professionals at Uppsala University.” That statement is not supported by the record.
     
      
      . For instance, amici the Leadership Council for Mental Health, Justice & the Media, et al., stated the generally accepted medical conclusion that "some children may not have immediate and specific reactions that meet the clinical definition of PTSD, yet these children can still be impacted by long-term psychological, personality and somatic disorders.”
     
      
      . The court's erroneous reliance on its conditions as a necessary prerequisite for safe return is discussed below.
     
      
      . This is not to say that a full evaluation must take place whenever an Article 13(b) defense is raised, or even that an inconclusive evaluation by itself would defeat an Article 13(b) claim if there was sufficient other proof of sexual abuse. There may be cases where a trial court is able to find that sexual abuse did or did not occur without the benefit of a full forensic evaluation. On the record here, the court could have concluded that, given the children’s repeated disclosures to the treating therapist and others and the other evidence presented suggestive of sexual abuse, this is such a case. Amici Massachusetts Citizens for Children, et al., agree that a forensic evaluation is not always necessary to establish abuse, and posit that the evidence was sufficient in this case to establish sexual abuse without a forensic evaluation.
     
      
      . Had the district court ordered the evaluation both parties had requested by December 3, 2001, the evaluation would now be complete and the interests of the Hague Convention in a speedy resolution would have been better served.
     
      
      . The children's expressed opposition and anxiety regarding the proposed return to Sweden tends to support the expert testimony indicating that they would view any such return as punishment for disclosures, and would be unlikely to cooperate in any further investigation if returned. Courts may consider the views of a child as evidence in making an Article 13(b) determination as to whether grave risk exists, even if the child is not yet old enough to justify Article 13’s defense for children who "ha[ve] attained an age and degree of maturity at which it is appropriate to take account of [their] views.” See Blondin, 238 F.3d at 166.
     
      
      . The opinion was shared by Dr. van der Kolk, whose report indicated that switching therapists at this point "could have a marked impact on their ability and willingness to communicate the source of their trauma.” At trial, Dr. van der Kolk testified that the likelihood, if the children were returned, is that "they will clam up and will not say anything.” The GAL, referring to the younger daughter, also noted that "it is going to be a challenge for any evaluator or counselor to effectively work with the child in the coming several weeks or months.” The trial court expressed its reasons for discounting this testimony at 183 F.Supp.2d at 323, n. 7.
     
      
      . In this case, Danaipour’s objections prevented the Swedish authorities from conducting a full evaluation. McLarey requested an investigation twice from the Swedish social services, then from the Swedish Child and Youth Psychiatric Service, and then from the Swedish court. On each of these occasions, Danaipour did not agree to allow a sexual abuse investigation.
     
      
      . A referral to the police is not the type of evaluation required or contemplated by the United States district court. The testimony at trial was that the Swedish police conducted an investigation in the winter of 2000-2001, an investigation which all experts agreed did not comport with established protocols for sexual abuse evaluations. Moreover, McLarey's experts on the Swedish police and legal system indicated that Swedish police investigations do not typically follow established sexual abuse protocols, testimony that Danaipour did not refute at trial.
     