Statistical Evidence. Preface.

There's a story* about two doctors who are floating above the countryside in a hot air balloon. They are drifting with the wind and enjoying the scenery, but after a couple of hours, they realize that they are totally lost. They see someone on the ground and shout down, "Hello! Can you tell us where we are?"

The person on the ground replies, "You're fifty feet up in the air, in a hot air balloon."

One doctor turns to the other and says, "That person on the ground must be a statistician."

"How did you know?" came astonished reply from the ground.

"Only a statistician would provide an answer that was totally accurate and totally useless at the same time."

In my stories, of course, the statistician always has the last word.

"Very good. But I can also tell that you two are doctors."

It was the doctors' turn to be astonished. The statistician explained. "Only a doctor would have such a good view of the area and still not have any idea where they were."

If you are a doctor or any other health care professional, you have just such a view of the research. Thousands of medical journals publish hundreds of research articles each year. But with all that information, it is still difficult for you to know what is going on.

Several years ago, I became very interested in understanding how health care professionals made decisions. How did they choose which new therapies and treatments to adopt? When did the evidence in favor of a new practice become compelling enough to get them to drop an old and ingrained way of practicing their craft?

It's not an easy question to answer. Medical professionals who cling stubbornly to what they learned in school are not doing their job well. But adopting willy-nilly any new trend that comes along would make things even worse.

If you have ever agonized about whether to change your practice on the basis of a new research study, this book is for you. Is a research study definitive, or is it an interesting finding that needs replication? I can help answer this question. Not because I can gauge the quality of the evidence better than you can, but because I can help you ask the right questions. Was there a good control group? Did the researchers study the right patients? Did they measure the proper outcomes?

How did this all get started?

The original inspiration for this book came from the students in an informal class I was teaching at Children's Mercy Hospital in 1997. In a survey, I asked the students why they were taking the class. My hope was that this information would help me select future topics for discussion. A common response was along the lines of "I want to understand the statistics used in medical journal articles." So I prepared a talk called "How to Read a Medical Journal Article." I expanded the talk into a web page (www.childrensmercy.org/stats/journal.asp).

Some of the original material that inspired this book can still be found there, as well as in a weblog that I started in 2004 (www.childrensmercy.org/stats/weblog.asp). 

Around the same time, I had the good fortune of being invited to write a series of articles about research for the Lab Corner section of the Journal of Andrology. This allowed me to further refine these ideas.

My other inspiration came from the invitations I got to participate in several journal clubs at Children's Mercy Hospital. The journal articles were always interesting, and the discussions helped me polish the ideas that I am presenting here.

Outline of this Book

The Introduction documents some of the weaknesses in published research that you need to be aware of. Some of you don't need any convincing that much of the research being published has serious limitations. This is where I make my case that you should worry more about how the data was collected than about how it was analyzed. I also stress the importance of critical thinking.

"Apples or Oranges?" examines the quality of the control group. How carefully the control group was selected and handled relates to credibility of the research. If you want a technical term, this is often called the internal validity of the research.

"Who Was Left Out?" considers exclusions before the study started, and exclusions during the study. If important segments of the population are left out, then you may have difficulty generalizing the results of the study. This is often called the external validity of the research.

"Mountain or Molehill?" examines the clinical relevance of the outcome. The outcome measure has to be properly collected and has to measure something of interest to your patients. The size of the study has to be large enough to produce reasonably precise estimates and the difference between the treatment and control group has to be large enough to have a clinical impact.

"What do the other witnesses say?" discusses how to look at additional corroborating evidence outside the journal article itself. Corroborating evidence is especially important for observational studies, because it is rare that a single observational study provides definitive results entirely by itself. Rather, it is a collection of observational studies, all looking at the problem from a different perspective that can provide persuasive evidence. This section is loosely based on the nine factors to assess a causal relationship that Sir Bradford Hill developed in 1966.

"Do the pieces fit together?" applies the same principles of statistical evidence to meta-analyses and systematic overviews. Study heterogeneity, study quality, and publication bias are serious threats to the validity of a systematic overview.

"What do all these numbers mean?" gives a non-technical explanation for some of the statistics used in hypothesis testing, such as p-values and confidence intervals. It also explains the various measures of risk, like the odds ratio, relative risk, and number needed to treat.

"Where is the evidence?" gives a brief overview of how to search for research articles. The first step is to structure your question carefully using the PICO format. Then you should start with high level sources first, sources that include summaries and systematic overviews. These are better than using  PubMed or the Internet, which often offer too much information for you to properly synthesize. If you do need to use PubMed or the Internet, though, I offer some tips for refining your search.

Who is this book for?

I am writing this book for any health care professional who is making the effort to read and evaluate medical publications. Do you update and modify your clinical practice on the basis of what you read in the research journals? I have guidelines that can help you.

Non-medical professionals can also benefit from this book. I do use a few technical medical terms, but as long as words like "myocardial infarction" don't give you a heart attack, you will be just fine. Indeed, many people who, like me, do not have specialized medical training still read medical journals. Journalists, for example, have to write about the peer-reviewed literature for the public, and they need to know when researchers are overhyping their findings. Lawyers involved with malpractice suits need to understand which medical practices have been supported by medical research, which practices have been discredited, and which practices still require additional research. More and more patients want to research their own diseases so they can discuss treatment options intelligently with their doctors.

And while I focus mostly on medical examples, the general principles apply to other areas as well. If you work in a non-medical field, but you read peer-reviewed journals and try to incorporate their findings into your job, my guidelines can help you.

I did not write this book to teach you how to conduct good research. I wrote it for consumers of research, not producers of research. Even so, when you plan your research you should try to use a research design that is most likely to be persuasive. To that extent, my book can help.

There are several things I am quite proud of in this book.

Extensive use of real world examples. There are a lot of fascinating research papers out there, and they tell intriguing stories. These papers pose interesting questions like "What sort of person would volunteer to have a spinal tap done as part of a research study?" and "Why would a doctor flip a sterilized coin in the operating room?" I have included hundreds of citations in this book, and many of these examples have their full text available on the web for free.

Focus on statistical issues. When you are trying to assess the quality of a medical publication, most of the issues touch directly on Statistics. And yet Statistics is the one area that intimidates medical professionals the most. Well, Statistics isn't brain surgery, and you are capable of understanding the concepts.

Avoidance of formulas and technical language. People think that Statistics is a bunch of numbers and formulas, but there are a lot of non-quantitative issues in how statistics are applied in research. When you are trying to assess the credibility of a research study, these non-quantitative concerns are far more important than any formulas or statistical calculations.

Acknowledgements

I could not have written this book without the hard work of my administrative assistant, Linda Foland, who has tamed a massive database of almost 5,000 bibliographic entries. She has also applied her sharp editorial eye to the web pages that eventually morphed into the book that you are now reading. Linda was preceded by two other very capable administrative assistants, Carla Liebentritt and Meg Goodloe, who have deservedly gone on to bigger and better things, but who were of immense help while I had the privilege of working with them.

Alison Jones at Oxford University Press has been great to work with. She has patiently guided me along the process, and has tolerated many slipped deadlines.

All of the "Own Your Own" exercises as well as the graphs and figures that you see in this book come from papers published by Biomed Central under the open access license. This license allows you flexibility to use to copy and display the work or any derivative work as long as you cite the original source. I have to thank the authors who are brave enough to try this publication model, as it makes it so much easier to produce my web pages and this book.

I also have learned a lot from the participants of various Internet email discussion groups (especially edstat-l, epidemio-l, evidence-based-health, irbforum, and stat-l), who have shared their wisdom with me and the rest of the world. My meager contributions to these groups can only be a small and partial repayment for all the things that I have learned.

Thanks also go to the doctors, nurses, and other health care professionals at Children's Mercy Hospital, where I work, who helped keep me on my toes by asking difficult questions that forced me to think hard about the process of research. Thanks to all of you, my job is a constant intellectual challenge.

Most of all, I have to thank my wife, Cathy, who has always provided support and encouragement throughout the entire process. Cathy, your unwavering belief in me gave me the spark to persevere.

Footnotes

*I can't claim credit for this joke. It has been running around the Internet in various forms for years. Do a web search on the words "joke hot air balloon" for some examples.

This work is licensed under a Creative Commons Attribution 3.0 United States License.