I’ll begin with the possibly shocking admission that I’m a strong supporter of the collection of ideas and techniques known as evidence-based medicine (EBM). I’m even the current President of the Evidence-Based Veterinary Medicine Association (EBVMA). This may seem a bit heretical in this context, since EBM takes a lot of heat on this blog. But as Dr. Atwood has said, “we at SBM are in total agreement…that EBM ‘should not be without consideration of prior probability, laws of physics, or plain common sense,’ and that SBM and EBM should not only be mutually inclusive, they should be synonymous.” So I have hope that by emphasizing the distinction between SBM and EBM and the limitations of EBM, we can engender the kind of changes in approach needed to address those limitations and eliminate the need for the distinction. One way of doing this is to critically evaluate the misuses of EBM in support of alternative therapies.
One of the highest levels of evidence in the hierarchy of evidence-based medicine is the systematic review. In a narrative review, an author selects the studies they consider relevant and then summarizes what they think those studies mean, a process subject to a high risk of bias. A systematic review, by contrast, identifies randomized controlled clinical trials according to an explicit and objective set of criteria established ahead of time. Predetermined criteria are also used to grade the quality of the included studies, so that any relationship between how well studies are conducted and their results can be identified. Done well, a systematic review gives a good sense of the balance of the evidence for a specific medical question.
Unfortunately, poorly done systematic reviews can create a strong but inaccurate impression that there is high-level, high-quality evidence in favor of a hypothesis when there really isn’t. Reviews of acupuncture research illustrate this quite well.
Acupuncture is one of the most studied practices in complementary and alternative medicine (CAM), and this means there is a large volume of research to evaluate. While one might expect this to be a good thing, making it easier to tell whether acupuncture is effective for any specific medical problem, the sheer number of studies actually muddies the waters and makes the truth about the clinical efficacy of acupuncture difficult to discern. The more studies there are, the greater the chance of getting some positive results even for an ineffective therapy. If the quality or methodology of the studies is poor, the results will be unreliable. And if numerous such studies of questionable quality exist, it becomes easier to generate systematic reviews which appear to provide high-level supporting evidence that doesn’t actually mean what it looks like it means.
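The point about volume can be made concrete with a little arithmetic. The sketch below is only an illustration under simplifying assumptions that are mine, not drawn from any of the studies discussed here: the therapy has no real effect, each trial is independent, and each uses the conventional 5% significance threshold. The function name is hypothetical.

```python
def prob_at_least_one_false_positive(n_trials: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n independent trials of an ineffective
    therapy comes out 'positive' by chance alone.

    Assumptions (illustrative only): the therapy truly does nothing, trials
    are independent, and each has false-positive rate `alpha`.
    """
    if n_trials < 0:
        raise ValueError("n_trials must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    # P(no false positives in n trials) = (1 - alpha) ** n
    return 1.0 - (1.0 - alpha) ** n_trials


if __name__ == "__main__":
    for n in (1, 7, 20, 100):
        p = prob_at_least_one_false_positive(n)
        print(f"{n:>3} trials: {p:.1%} chance of at least one positive result")
```

Under these assumptions the chance of at least one spurious positive climbs quickly as trials accumulate, and that is before any allowance for poor randomization or publication bias, both of which would push the number of positive reports higher still.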
For example, a recent systematic review of the use of acupuncture for pain following stroke appeared in the Journal of Alternative and Complementary Medicine.
Jung Ah Lee, Si-Woon Park, Pil Woo Hwang, Sung Min Lim, Sejeong Kook, Kyung In Choi, and Kyoung Sook Kang. Acupuncture for Shoulder Pain After Stroke: A Systematic Review. The Journal of Alternative and Complementary Medicine. September 2012, 18(9): 818-823.
The conclusion seems quite promising: “It is concluded from this systematic review that acupuncture combined with exercise is effective for shoulder pain after stroke.” Given that a systematic review is high-level evidence, this ought to provide us with a fair degree of confidence that acupuncture is useful for this problem.
But a more detailed look casts a bit of doubt on this conclusion. For one thing, 453 studies were identified and only 7 met the quality criteria for inclusion. This suggests that, even in the eyes of acupuncture researchers, most acupuncture research is lousy. And the 7 studies that were chosen for evaluation were all conducted and published in China and all showed positive results. Their results may have as much to do with how research is conducted and published in China as with the efficacy of acupuncture for this problem.
While there is no question that some great scientific research is done in China, there is evidence for a systematic problem with the conduct and publication of alternative medicine studies there. Studies reported as randomized are most often not actually properly randomized. And one review in 1998 found that no negative study of acupuncture had ever been published in China. This strongly suggests that the acupuncture literature coming from China is unreliable due to poor methodological quality and a high risk of publication bias.
A review of systematic reviews published in the same journal as the review of acupuncture for shoulder pain also supports a skeptical interpretation of the first paper.
Bin Ma, Guo-qing Qi, Xiao-ting Lin, Ting Wang, Zhi-min Chen, and Ke-hu Yang. Epidemiology, Quality, and Reporting Characteristics of Systematic Reviews of Acupuncture Interventions Published in Chinese Journals. The Journal of Alternative and Complementary Medicine. September 2012, 18(9): 813-817.
These authors identified and evaluated systematic reviews of acupuncture research published in China and these were their findings:
Results: A total of 88 SRs were identified; none of the reviews had been updated. Less than one third (27.3%) were written by clinicians and one third (35.2%) were reported in specialty journals. The impact factor of 53.4% of the journals published was 0. Information retrieval was not comprehensive in more than half (59.1%) of the reviews. Less than half (36.4%) reported assessing for publication bias. Though 97.7% of the reviews used the term “systematic review” or “meta-analysis” in the title, no reviews reported a protocol and none were updated even after they had been published after 2 or more years.
Conclusions: Although many SRs of acupuncture interventions have been published in Chinese journals, the reporting quality is troubling. Thus, the most urgent strategy is to focus on increasing the standard of SRs of acupuncture interventions, rather than continuing to publish them in great quantity.
This suggests that most systematic reviews of acupuncture published in China don’t search the literature thoroughly and don’t evaluate it properly. Given existing evidence that much of the research being reviewed is itself questionable, there is ample reason to be suspicious of the conclusions of such systematic reviews.
When supporters of acupuncture claiming to follow the principles of evidence-based medicine cite systematic reviews, there is a strong possibility that these reviews don’t actually fairly present the balance of the evidence. If they are poor quality reviews based on a biased sample of questionable studies, then they can only serve to create an inaccurate impression of the efficacy of acupuncture.
And there are systematic reviews of the systematic reviews for acupuncture which have found that the balance of the evidence does not suggest a benefit from acupuncture: “In conclusion, numerous systematic reviews have generated little truly convincing evidence that acupuncture is effective in reducing pain.” A large number of studies makes it possible to generate high-level evidence both for and against a hypothesis, in this case concluding both that acupuncture does and does not relieve pain. That only further clouds the issue since naturally everyone cites those reviews which support their a priori position on acupuncture.
Another way of evaluating the state of the evidence on a given intervention is to compare the quality of studies with the likelihood of positive results. Dr. R. Barker Bausell has reviewed the acupuncture literature this way in his book Snake Oil Science. As it turns out, the highest-quality studies of acupuncture consistently find that acupuncture works no better than placebo, and that using fake needles and even jabbing the skin in random places with toothpicks work just as well as “real” acupuncture. Lower-quality studies are more likely to be positive. This too casts doubt on the reliability of positive clinical trials.
As supporters of acupuncture will undoubtedly point out, this doesn’t prove acupuncture doesn’t work in those conditions for which systematic reviews have stated it does work. It does show, however, that a lot of time, energy, and money has been spent on acupuncture research without generating a consistent body of evidence that can support it or justify any great confidence.
Which raises the issues of plausibility and prior probability, often cited as the primary sources of contention between SBM and EBM. In theory, I do not object to clinical trial testing of interventions without well-established theoretical foundations. As Sir Austin Bradford Hill, one of the early luminaries of clinical epidemiology, put it, “What is biologically plausible depends upon the biological knowledge of the day.” As CAM proponents delight in pointing out, sometimes wacky ideas prove true.
What they often fail to acknowledge, though, is that science does a pretty good job of accommodating such surprises if they can prove themselves through rigorous testing. The theory that Helicobacter could cause duodenal ulcers was considered implausible when proposed in 1982, and it won a Nobel Prize for the proponents of the idea in 2005. That’s a pretty quick acceptance of an initially controversial idea, and it’s not consistent with the caricature of mainstream science as closed-minded and dogmatic.
In the real world, however, crazy ideas are far more likely to turn out to be wrong than revolutionary. Dr. Sanden’s Electric Belt was at least as wacky as the idea that bacteria cause ulcers, but it has faded into history without any recognition from the Nobel committee. When time, money, and talent are limited (and they always are), spending them on ideas unlikely to bear fruit is hard to justify.
While a perfect world might allow for thorough, methodical testing of every possible practice, in this world we owe it to our patients to focus our energies on those ideas most likely to result in real help for them, those ideas which build on established knowledge rather than asking us to ignore or overturn it.
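To see why prior probability matters so much when interpreting a positive trial, here is a rough sketch using Bayes’ rule. The numbers are assumptions chosen purely for illustration (a 5% false-positive rate, 80% statistical power, and no bias of any kind), and the function name is hypothetical. It is not an estimate of the actual prior for acupuncture or any other therapy.

```python
def prob_true_given_positive(prior: float, power: float = 0.8,
                             alpha: float = 0.05) -> float:
    """Probability that a therapy really works, given one positive trial.

    prior: assumed probability the therapy works before the trial is run
    power: assumed chance a trial detects a real effect (illustrative 0.8)
    alpha: assumed chance of a false positive when there is no effect (0.05)
    """
    for name, value in (("prior", prior), ("power", power), ("alpha", alpha)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    true_positives = prior * power            # works AND trial is positive
    false_positives = (1.0 - prior) * alpha   # doesn't work, positive anyway
    total_positives = true_positives + false_positives
    if total_positives == 0.0:
        raise ValueError("a positive result is impossible with these inputs")
    return true_positives / total_positives


if __name__ == "__main__":
    for prior in (0.5, 0.1, 0.01):
        posterior = prob_true_given_positive(prior)
        print(f"prior {prior:.0%} -> probability it works after "
              f"a positive trial: {posterior:.0%}")
```

With these illustrative inputs, a positive trial of an idea that started out as a coin flip leaves it very likely to be true, while the same positive trial of an idea with a 1-in-100 prior still leaves it more likely false than true. Real trials with methodological flaws would fare worse than this idealized calculation suggests.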
Finally, some sort of reasonable limit on the time and resources committed to investigating an idea is needed. When an adequate effort has been made and a strong, consistent body of evidence has failed to emerge, it is time to move on.
In the case of acupuncture, the original theoretical mechanisms invoked to explain why it should help (Ch’i, meridians, and so on) are vitalistic and inconsistent with established science. Attempts to find alternative mechanisms have yielded some interesting information about physiology and the mediation of pain sensation, but they have not turned up a coherent, unified theory of action supported by good evidence. And enormous numbers of clinical trials have been done over decades, again without yielding a consistent body of evidence supporting a specific therapeutic effect for acupuncture beyond the placebo effects of the therapeutic ritual.
So determining the truth about acupuncture requires more than simply looking for published systematic reviews. The quality of these reviews, and the studies they evaluate, must be critically appraised and the evidence at all levels, not simply clinical trials, must be considered. Finally, the proposed mechanisms by which acupuncture might work must also be critically evaluated to see if they are supported by good evidence and are not strongly at odds with established scientific knowledge. It is a misuse of evidence-based medicine to simply conduct poor quality systematic reviews on poor quality trials with a high risk of bias and then take the conclusion of these reviews at face value. A more comprehensive look at the question and the evidence at all levels is required. This is what is meant by science-based medicine, and it is what good evidence-based medicine should be.