For my first blog entry, I wanted to write about something important, and I couldn’t think of anything more important than a recent book by R. Barker Bausell: Snake Oil Science: The Truth About Complementary and Alternative Medicine. If you want to understand how medical research works, if you want to know what can lead patients and scientists to false conclusions, if you have ever used complementary or alternative medicine or have wondered why others do, if you value evidence over belief, if you care about the truth, you will find a treasure trove of information in this book.
Some of the treatments encompassed under “complementary and alternative medicine” (CAM) have been around for a long time. Before we had science, “CAM” was medicine. Back then, all we had to rely on was testimonials and beliefs. And even today, for most people who believe CAM works, belief is enough. But at some level, the public has now recognized that science matters and people are looking for evidence to support those beliefs. Advocates claim that recent research validates CAM therapies. Does it really? Does the evidence show that any CAM therapy actually works better than placebos? R. Barker Bausell asks that question, does a compellingly thorough investigation, and comes up with a resounding “NO” for an answer.
Bausell is the ideal person to ask such a question. He is a research methodologist: he designs and analyzes research studies for a living. Not only that: he was intimately involved with acupuncture research for the National Center for Complementary and Alternative Medicine (NCCAM). So when he talks about what can go wrong in research and why much of the research on CAM is suspect, he is well worth listening to.
He describes his acupuncture research in great detail. It involved patients with pain from dental surgery. Before designing the experiments, he searched the literature and found an article that reviewed 16 previous trials of acupuncture for dental pain and concluded that it was probably effective. But on the Jadad scale, a simple 5-point measure of study quality, none of those 16 studies scored higher than 3 (considered barely adequate), and, incredibly, 5 of them scored zero. Bausell’s group set out to resolve the question with research of much higher quality. For instance, a low dropout rate is one measure of quality: only 3 subjects dropped out during the course of their study, and those 3 were people the researchers sent home because of a snowstorm!
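For readers curious about what a Jadad score actually measures: it awards points for randomization, double-blinding, and accounting for withdrawals. Here is a rough sketch of the standard scoring rules in Python (my own illustration; the scale comes from Jadad et al., not from Bausell’s book):

```python
from typing import Optional

def jadad_score(randomized: bool,
                randomization_method_ok: Optional[bool],
                double_blind: bool,
                blinding_method_ok: Optional[bool],
                dropouts_described: bool) -> int:
    """Jadad quality score, 0-5.

    Method flags: True = described and appropriate (+1),
    False = described but inappropriate (-1), None = not described (0).
    """
    score = 0
    if randomized:
        score += 1
        if randomization_method_ok is True:
            score += 1
        elif randomization_method_ok is False:
            score -= 1
    if double_blind:
        score += 1
        if blinding_method_ok is True:
            score += 1
        elif blinding_method_ok is False:
            score -= 1
    if dropouts_described:
        score += 1
    return score

# A trial that merely says "randomized, double-blind" with no method
# details and no accounting of dropouts scores only 2.
print(jadad_score(True, None, True, None, False))  # 2
```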
They compared “true” acupuncture to the most credible “sham” acupuncture they could devise. There was no difference in outcome: both were equally effective in relieving pain. When they looked more closely at their data, they found some surprises. The placebo control was not perfect, and some subjects had been able to guess which group they were in. Knowing you really got acupuncture should have increased the placebo response, and knowing you didn’t should have decreased it – yet even so, there was no difference between the groups. So the results were even more negative than they appeared. Even more fascinating, patients who thought they got real acupuncture reported much more pain relief than those who thought they got the sham, regardless of which they actually got!
Bausell points out that penicillin cures pneumonia even if you’re in a coma, but alternative medicine only seems to work when you are awake: you have to know (or think) you’re being treated. And penicillin works by well-understood scientific principles, while much of alternative medicine is based on “entire physiologic systems or physical forces that the average high school science teacher already knew didn’t exist.” If any alternative treatment clearly worked as well as penicillin, prior plausibility wouldn’t matter: science would adopt it and worry about how it worked later. In the absence of such clear effects, prior plausibility becomes an important consideration.
He tells his mother-in-law’s story. She had knee pain from osteoarthritis, with fluctuating symptoms. Every time the pain increased, she would try something new she had read about in Prevention magazine, and every time it would seem to work as the pain naturally decreased again. And eventually it would seem to stop working as the pain naturally increased again. She would phone every couple of months to tell him about the wonderful new treatment she had discovered. She was not ignorant or stupid, but she underestimated the power of the placebo and didn’t realize how the natural fluctuations of her pain led her to false conclusions.
She had fallen for the most common human error: the post hoc ergo propter hoc fallacy. The fact that pain relief follows treatment doesn’t necessarily mean that the treatment caused the pain relief. This is only one of the many impediments to correct thinking that plague our fallible human brains. Bausell describes some of those other impediments. He shows how patients, doctors, and researchers are all equally likely to fool themselves, and why the most rigorous science is needed to keep us from reaching false conclusions.
Bausell’s thorough discussion of the placebo phenomenon is illuminating and invaluable. He covers the history of research on placebos and tells some fascinating anecdotes. He argues that the placebo response is not just imagination: it is a learned phenomenon, a conditioned response. You respond to a placebo pill because you have previous experience of being helped by pills. Morphine injections in dogs cause salivation as a side effect; after a while, you can inject plain water and the dogs will still salivate. Physiologic effects from placebos are always smaller than those of the real thing, but apparently they do occur. The evidence for objective physiologic effects may not be entirely convincing, but it is certain that pain and other subjective symptoms respond to placebos. There is even research suggesting a mechanism: the release of endogenous opioids, pain-relieving chemicals produced by our own brains. If you counteract those chemicals with an opioid antagonist like naloxone (Narcan), you can block the placebo response.
He shows that the act of taking a pill may really relieve pain even when the contents of the pill are irrelevant. Research shows a hierarchy of placebo response: injections work better than capsules, and capsules work better than tablets. The color and size of the pill and the frequency of dosing all make a difference. And intriguingly, patients who have responded to a placebo show distortions of memory: they remember the pain relief as greater than it actually was! Bausell points out that
“…just because someone with a PhD or an MD performs a clinical trial doesn’t mean that the trial possesses any credibility whatsoever. In fact, the vast majority of these efforts are worse than worthless because they produce misleading results.”
The book includes valuable lessons on how to tell credible research from the other kind. Even the most experienced researchers will find food for thought here, and for the layman it will be a revelation.
Research is full of pitfalls. Negative studies tend not to get published (the file drawer effect). Research done by believers and by pharmaceutical companies tends to be more positive than research done by others. Studies from non-English-speaking countries are notoriously unreliable for various reasons: 98% of the acupuncture studies from Asia are positive, compared to 30% from Canada, Australia, and New Zealand. The researcher may delegate the actual data collection to others, who may make undetected mistakes or deliberately skew results to please the boss. Double-blind studies may not be truly blind: subjects may be able to guess which group they are in. Subjects who are not responding may drop out. People who believe in homeopathy are more likely to volunteer for homeopathy studies. Researchers may put a positive spin on their findings or reach conclusions that are not justified by the data. And even if the research is impeccable, the conventional significance threshold of p = 0.05 means that a completely ineffective treatment still has a 5% chance of producing a falsely positive result purely by chance. There are more pitfalls, and Bausell covers them all.
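To make that last pitfall concrete, here is a minimal simulation in Python (my illustration, not from the book): it runs thousands of trials of a treatment that does nothing at all and counts how often a standard t-test declares the result significant at p < 0.05.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n_trials = 10_000      # simulated "studies" of a treatment that does nothing
n_per_group = 50       # subjects per arm
false_positives = 0

for _ in range(n_trials):
    # Treatment and placebo groups drawn from the same distribution:
    # any "significant" difference between them is pure chance.
    treatment = rng.normal(0.0, 1.0, n_per_group)
    placebo = rng.normal(0.0, 1.0, n_per_group)
    _, p_value = ttest_ind(treatment, placebo)
    if p_value < 0.05:
        false_positives += 1

print(f"False positive rate: {false_positives / n_trials:.3f}")  # ≈ 0.05
```

Roughly one simulated study in twenty comes up “significant,” and those are exactly the ones most likely to be written up and published.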
When you come right down to it, no experiment is beyond criticism, and most published research findings turn out to be wrong. So how can we decide which studies are credible? We now have published guidelines such as the 22-item Consolidated Standards of Reporting Trials (CONSORT) checklist to assess the quality of randomized controlled trials, but Bausell offers four simpler criteria that can rule out the worst offenders (sketched as a simple filter after the list):
- Subjects are randomly assigned to a CAM therapy or a credible placebo
- At least 50 subjects per group
- Less than 25% dropout rate
- Publication in a high-quality, prestigious, peer-reviewed journal
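Bausell applies these criteria by hand, but as a rough illustration (mine, not the book’s), the checklist is simple enough to express as a filter in a few lines of Python:

```python
from dataclasses import dataclass

@dataclass
class Trial:
    randomized_vs_credible_placebo: bool
    subjects_per_group: int
    dropout_rate: float            # fraction, e.g. 0.10 for 10%
    prestigious_journal: bool      # high-quality, peer-reviewed venue

def passes_bausell_screen(trial: Trial) -> bool:
    """Apply Bausell's four-item checklist to rule out the worst offenders."""
    return (trial.randomized_vs_credible_placebo
            and trial.subjects_per_group >= 50
            and trial.dropout_rate < 0.25
            and trial.prestigious_journal)

# A 40-subject-per-group trial fails the screen regardless of its other virtues.
print(passes_bausell_screen(Trial(True, 40, 0.10, True)))  # False
```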
Using this simple 4-item checklist, he reviewed all the CAM studies published in The New England Journal of Medicine and The Journal of the American Medical Association from 2000 to 2007. Fourteen met the criteria, and all were negative. When he expanded the search to include the Annals of Internal Medicine and the Archives of Internal Medicine, he ended up with 22 qualifying studies, only one of which was positive (exactly what you would expect from the 5% rule if none of the treatments worked).
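That parenthetical is simple arithmetic to check. If none of the 22 treatments worked and each trial carries a 5% chance of a false positive, the number of positive results follows a binomial distribution with n = 22 and p = 0.05. A quick sketch (my calculation, not the book’s):

```python
from scipy.stats import binom

n, p = 22, 0.05
print(f"Expected false positives: {n * p:.1f}")               # 1.1
print(f"P(exactly one positive):  {binom.pmf(1, n, p):.2f}")  # ~0.37
print(f"P(one or more positives): {binom.sf(0, n, p):.2f}")   # ~0.68
```

One positive study out of 22 is almost exactly the expected yield of pure chance.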
Since different studies have conflicting results, we now use meta-analyses and systematic reviews to try to reach reliable conclusions. In fact, we even have systematic reviews of systematic reviews! After explaining why systematic reviews are subject to several pitfalls of their own, Bausell goes on to examine the high-quality systematic reviews of the Cochrane Collaboration. Cochrane’s independent reviewers take the quality of studies into account and try to evaluate all the published evidence without bias. Of 98 CAM reviews in the Cochrane database, 21 were positive. When he subtracted those that lacked confirmation by studies in English-speaking countries, those with suspect controls, and those that were subsequently trumped by more definitive high-quality studies, the percentage of positive reviews dropped to that familiar 5%.
A highly touted non-Cochrane review of homeopathy concluded that the clinical effects of homeopathy were not just due to placebo. But strangely they also concluded that there was insufficient evidence to show that any single homeopathic treatment was clearly effective in any one clinical condition. A re-analysis of the studies they had reviewed showed that when only the highest quality studies were considered, the alleged positive effect for homeopathy disappeared.
What all this amounts to is that advocates can point to plenty of “snake oil” science that apparently supports various CAM treatments; but when examined critically, the entire body of evidence is compatible with the hypothesis that no CAM method works any better than placebo. True believers will never give up their favorite treatment because of negative evidence; they will always want to try one more study in the hope that it will vindicate their belief. They see science as a method they can take advantage of to convince others that their treatment works. They don’t see it as a method of finding out whether their treatment works. Bausell says,
“CAM therapists simply do not value (and most, in my experience, do not understand) the scientific process.”
He doesn’t try to tell us what to think, but he educates us in how to think critically about medical claims and about medical research. He doesn’t aim to dissuade anyone from using CAM. He just doesn’t want anyone to choose it for the wrong reasons, to be fooled into thinking there is credible evidence where there isn’t. He emphasizes that CAM nourishes hope, and its placebos work, if only for symptoms that would eventually resolve on their own anyway. The comfort CAM brings can be valuable, as long as it is not used in place of effective treatments for serious conditions – and most of the time it isn’t, despite the occasional horror story of a patient who refuses effective cancer treatment and dies using a worthless remedy. Bausell ends his book with advice on how to choose an effective placebo therapy!