StATS: More on the Emily Rosa experiment (created 2006-03-10).

One of the more interesting research studies from an Evidence-Based Medicine perspective started out as a simple science fair project by a fourth-grade student. Emily Rosa wanted to see if practitioners of Therapeutic Touch could detect energy fields under carefully controlled conditions. The topic of this project was not too surprising, since her parents both worked for the QuackWatch website, but Emily came up with the idea entirely on her own. The science project received a lot of publicity, and Emily was encouraged to publish her results in a medical journal. With the assistance of several adults, the publication,

appeared, giving Emily Rosa something nice to put on her resume when she applies to college. I'm still waiting for my first publication in an "A journal" like JAMA, so I am quite jealous.

Here are a few excerpts from the publication:

In 1996 and 1997, by searching for advertisements and following other leads, 2 of us (L.R. and L.S.) located 25 TT practitioners in northeastern Colorado, 21 of whom readily agreed to be tested. Of those who did not, 1 stated she was not qualified, 2 gave no reason, and 1 agreed but canceled on the day of the test.

The reported practice experience of those tested ranged from 1 to 27 years. There were 9 nurses, 7 certified massage therapists, 2 laypersons, 1 chiropractor, 1 medical assistant, and 1 phlebotomist. All but 2 were women, which reflects the sex ratio of the practitioner population. One nurse had published an article on TT in a journal for nurse practitioners.

There were 2 series of tests. In 1996, 15 practitioners were tested at their homes or offices on different days for a period of several months. In 1997, 13 practitioners, including 7 from the first series, were tested in a single day.

The test procedures were explained by 1 of the authors (E.R.), who designed the experiment herself. The first series of tests was conducted when she was 9 years old. The participants were informed that the study would be published as her fourth-grade science-fair project and gave their consent to be tested. The decision to submit the results to a scientific journal was made several months later, after people who heard about the results encouraged publication. The second test series was done at the request of a Public Broadcasting Service television producer who had heard about the first study. Participants in the second series were informed that the test would be videotaped for possible broadcast and gave their consent.

During each test, the practitioners rested their hands, palms up, on a flat surface, approximately 25 to 30 cm apart. To prevent the experimenter's hands from being seen, a tall, opaque screen with cutouts at its base was placed over the subject's arms, and a cloth towel was attached to the screen and draped over them (Figure 1).

Each subject underwent a set of 10 trials. Before each set, the subject was permitted to "center" or make any other mental preparations deemed necessary. The experimenter flipped a coin to determine which of the subject's hands would be the target. The experimenter then hovered her right hand, palm down, 8 to 10 cm above the target and said, "Okay." The subject then stated which of his or her hands was nearer to the experimenter's hand. Each subject was permitted to take as much or as little time as necessary to make each determination. The time spent ranged from 7 to 19 minutes per set of trials.

To examine whether air movement or body heat might be detectable by the experimental subjects, preliminary tests were performed on 7 other subjects who had no training or belief in TT. Four were children who were unaware of the purpose of the test. Those results indicated that the apparatus prevented tactile cues from reaching the subject.

The odds of getting 8 of 10 trials correct by chance alone is 45 of 1024 (P=.04), a level considered significant in many clinical trials. We decided in advance that an individual would "pass" by making 8 or more correct selections and that those passing the test would be retested, although the retest results would not be included in the group analysis. Results for the group as a whole would not be considered positive unless the average score was above 6.7 at a 90% confidence level.
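A side note on the arithmetic in the last excerpt: 45 of 1024 is the chance of getting exactly 8 of 10 correct by guessing, while the "8 or more" passing rule corresponds to 56 of 1024, or about .055. Here is a minimal Python sketch of that calculation, assuming each trial is an independent fair coin flip (the function names are illustrative and do not come from the paper):

```python
from math import comb

def prob_exactly(k, n=10, p=0.5):
    """Binomial probability of exactly k correct guesses in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_at_least(k, n=10, p=0.5):
    """Binomial probability of k or more correct guesses in n trials."""
    return sum(prob_exactly(j, n, p) for j in range(k, n + 1))

# Exactly 8 of 10 correct: 45 ways out of 2**10 = 1024 outcomes.
print(prob_exactly(8), 45 / 1024)

# 8 or more correct (8, 9, or 10): 45 + 10 + 1 = 56 ways out of 1024.
print(prob_at_least(8), 56 / 1024)
```

The gap between the two figures is small, but it means the passing rule is slightly more lenient than the quoted P=.04 suggests.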

A nice graphic showing the experimental setup is on the web at

and I am going to try to get permission to use this figure on one of my web pages.

The results of the experiment were not good. In the original series of trials, only 70 of the 150 guesses were correct (47%, 95% CI 37% to 57%). In the second series, only 53 of 130 guesses were correct (41%, 95% CI 32% to 50%).
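For readers who want to check intervals like these, one common approach is the normal-approximation (Wald) interval for a proportion. The sketch below assumes every guess is independent, which ignores the fact that each practitioner contributed 10 guesses. The method behind the intervals quoted above is not stated, so the bounds it prints need not match them exactly. The function name is illustrative.

```python
from math import sqrt

def wald_ci(successes, trials, z=1.96):
    """Point estimate and normal-approximation interval for a proportion.

    Assumes independent trials; z=1.96 gives an approximate 95% interval.
    """
    p = successes / trials
    half_width = z * sqrt(p * (1 - p) / trials)
    return p, p - half_width, p + half_width

for label, correct, total in [("first series", 70, 150),
                              ("second series", 53, 130)]:
    p, low, high = wald_ci(correct, total)
    print(f"{label}: {p:.1%} correct, interval {low:.1%} to {high:.1%}")
```

Whatever method is used, neither series gives any hint of performance better than the 50% expected from guessing.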

I talk about this study in several places on these web pages:

I mostly repeat the same point, which is that this experiment, while possibly oversimplifying the conditions under which Therapeutic Touch is practiced, still provides convincing evidence against Therapeutic Touch.

When I used this example in a talk that I gave this week, someone asked an interesting question. Even though the average performance was very poor, was it possible that some of the participants performed very well? After all, you only need one person who can reliably detect energy fields to prove that there is something interesting going on.

This is sometimes called the "White Crow" argument, which is based on the quote:

"To upset the conclusion that all crows are black, there is no need to seek demonstration that no crows are black; it is sufficient to produce one white crow; a single one is sufficient." (William James, as quoted at www.prairieghosts.com/piper.html)

There is some plausibility to this quote, and it is often used in an attempt to salvage an otherwise negative research finding. As the website above notes, if you find one honest Spiritualist medium, that refutes the charge that all mediums are fakes. Perhaps so, but a reasonable person should also infer that in any area that is rife with fraud, you should be extremely careful in examining the evidence. As the number of documented fraudulent cases increases, the standard of proof should become increasingly high for any future claim made by a spiritual medium.

Not knowing the individual results of the Emily Rosa study off the top of my head, I did point out that any discipline where the average practitioner performs more poorly than a coin flip is a discipline that cannot be trusted in general. In particular, I would demand that a practitioner of Therapeutic Touch demonstrate their ability in a blinded experiment before I would consider giving them money for their services.

Going back today and re-reading the article, I noticed that the authors set a threshold of 8 correct responses out of 10 trials as an individual result interesting enough to warrant replication. The authors note that

Only 1 subject scored 8, and that same subject scored only 6 on the retest.

which pretty much demonstrates that none of the 21 people tested could reliably detect energy fields in a carefully controlled condition.
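To put that single score of 8 in context, here is a rough sketch of how often pure guessing would produce at least one passing score. It assumes 28 independent sets of 10 trials (15 in the first series plus 13 in the second) and a 50% chance of a correct answer on every trial. The function names are hypothetical.

```python
from math import comb

def prob_pass(threshold=8, trials=10, p=0.5):
    """Chance that one set of trials reaches the passing threshold by guessing."""
    return sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(threshold, trials + 1))

def prob_any_pass(sets, threshold=8, trials=10, p=0.5):
    """Chance that at least one of several independent sets passes."""
    return 1 - (1 - prob_pass(threshold, trials, p)) ** sets

# 28 sets of trials in total across the two series (assumption stated above).
print(f"chance of at least one pass by guessing: {prob_any_pass(28):.2f}")

# Chance that a guesser who passed once would pass again on a retest.
print(f"chance of passing a retest by guessing: {prob_pass():.3f}")
```

Under these assumptions, guessing alone would yield at least one passing score in roughly four out of five such experiments, so one score of 8 that falls to 6 on the retest is about what chance predicts.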

The reaction of the Therapeutic Touch community was either to attack the research or ignore it. The attacks were quite vicious, actually, and unwarranted. It reminds me of a quote by someone else confronted with convincing evidence from a carefully controlled experiment that discredited his fervently held beliefs:

"You see, that is why we never do double-blind testing anymore. It never works!" (as quoted on www.quackwatch.org/01QuackeryRelatedTopics/ideomotor.html)

This page was written by Steve Simon while working at Children's Mercy Hospital. Although I do not hold the copyright for this material, I am reproducing it here as a service, as it is no longer available on the Children's Mercy Hospital website.