Editor’s note: Because of Dr. Gorski’s appearance at CSICon over the weekend, he will be taking this Monday off. Fortunately, Dr. Coyne will more than ably substitute. Enjoy!
NIH is funding free training in the delivery of the Cancer to Health (C2H) intervention package, billed as “the first evidence-based behavioral intervention designed to patients newly diagnosed with cancer that is available for specialty training.” The announcement for the training claims that C2H “yielded robust and enduring gains, including reductions in patients’ emotional distress, improvements in social support, treatment adherence (chemotherapy), health behaviors (diet, smoking), and symptoms and functional status, and reduced risk for cancer recurrence.” Is this really an “empirically supported treatment” and does it reduce risk of cancer recurrence?
Apparently the NIH peer review committee thought there was sufficient evidence to fund this R25 training grant. Let’s look at the level of evidence for this intervention, an exercise that will highlight some of the pseudoscience and heavy-handed professional politics involved in promoting psychoneuroimmunological (PNI) interventions.
The report of the single study (full article available here) evaluating the efficacy of this intervention for physical health outcomes appeared in the American Cancer Society journal Cancer in 2008. An earlier report (full article available here) claimed to demonstrate the effects of the intervention on the “secondary outcomes” of mood, immune function, health behaviors, and adherence to cancer treatment and care.
The abstract of the 2008 Cancer article described the group intervention as a set of strategies to “reduce stress, improve mood, alter health behaviors, and maintain adherence to cancer treatment and care.” The abstract not only reported a reduced risk of cancer recurrence but also proclaimed that “psychological interventions as delivered and studied here can improve survival.” If this intervention indeed improved survival, it is curious that the claim was not echoed in the advertisements for this training program.
When the article first came out, I did a simple chi-square calculation on the raw recurrence and death events in a pair of 2×2 cross tabulations of outcomes for the intervention versus control group. No matter how I played with the data, group differences came nowhere near significance. Here is the online calculator and below are the data in Table 2 so that you can experiment for yourself (click to enlarge):
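For readers who would rather script the check than paste numbers into the online calculator, here is a minimal sketch of the same exercise in Python using scipy. The counts below are placeholders rather than the actual figures from Table 2; substitute the published numbers to repeat the calculation.

```python
from scipy.stats import chi2_contingency

# 2x2 cross-tabulation of events by trial arm.
# Rows: intervention, control; columns: event (recurrence or death), no event.
# Placeholder counts; replace with the numbers from Table 2.
table = [
    [29, 85],
    [30, 83],
]

chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.3f}")
```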
My colleagues and I decided to take a close look at the reports on this trial and write a commentary to be submitted to Cancer. We took the position that claims about reducing risk of recurrence and extending the survival of breast cancer patients are medical claims that should be held to the same standards as claims about medications and medical devices improving health outcomes. These standards include consistency between the abstract and the findings reported in the results section of an article, pre-specification of one or two primary outcomes and of the follow-up period, pre-specification of the analytic plan, and presentation of results in a way that allows readers to evaluate the appropriateness of the choice and interpretation of statistical tests. The latter would include transparent presentation of unadjusted primary outcomes in analyses of time-by-treatment interactions and avoidance of substituting secondary and subgroup analyses for the primary ones.
To help ensure these standards are met, most biomedical journals have embraced CONSORT as the standard for reporting results of clinical trials and, more recently, of abstracts. Many journals also require publicly accessible preregistration of trials as a condition for publishing later results, i.e., that investigators declare their intentions for sample sizes, outcomes, and analyses before they enroll the first patient. These standards are enforced less consistently with psychosocial trials, and preregistration was not in place at the time this clinical trial was implemented in the mid-90s. However, by the time these papers were published in 2004 and 2008, it had already been established that not meeting the CONSORT reporting standards carried a high risk of bias and unreliability of results. And investigators should not need the coaxing of CONSORT standards for abstracts to presume that abstracts should accurately reflect the results reported in the rest of the article.
When we submitted the commentary to Cancer, it was initially rejected, with the editor citing a standing policy of not accepting critical commentaries if the authors of the original article refused to respond. We asked the editor to re-evaluate the policy and reconsider the rejection of our commentary. We argued that the policy was inconsistent with the growing acceptance of the necessity of post-publication peer review. Essentially, the policy allowed authors to suppress criticism of their work, regardless of the validity of the criticism. Furthermore, our commentary presented not only a critique of the article but also called attention to a failure in editorial review that was worthy of note in itself. We therefore requested that we be allowed to expand our commentary substantially beyond the strict word limitations of correspondence about a particular study. After a meeting of the editorial board, the editor graciously accepted our requests.
In the commentary, we pointed out that the trial did not report significance tests for unadjusted outcomes and gave no rationale for the particular follow-up period of 11 years (7 to 13) in which progressions or deaths were recorded. Having investigators commit themselves to a particular observation period ahead of time prevents post hoc shrinking or extension of that period to get more favorable results based on peeking at the data. Nonetheless, we could find no significant differences in the proportion of women experiencing recurrence or dying, despite the claims of the investigators to the contrary. Furthermore, the six-month difference in median time to recurrence, or to death, was small, given the length of the observation period.
How were the investigators able to claim significant effects? By relying on dubious multivariate analyses with too high a ratio of covariates to events (recurrences or deaths). I’ll leave most of the technical statistical arguments to the commentary, but basically, the investigators’ approach carried a high risk of generating spurious effects. It is always reassuring when results for simple unadjusted primary outcomes in a randomized trial hold up after adjustment for possible confounds, although the rationale for controlling for any initial differences between groups is unclear, because randomization is itself supposed to take care of them. When results are not obtained in simple unadjusted analyses but then show up in multivariate analyses, the suspicion is that they are spurious, because the results of multivariate analyses often depend on arbitrary decisions about which covariates to include and how to score them, decisions that can be made and revised based on peeking at the data. We should be particularly suspicious when, as in this trial, too many covariates are entered as controls.
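To make the concern about post hoc covariate choices concrete, here is a rough simulation, not the trial’s actual data or analysis, of what happens when there is truly no treatment effect but the analyst is free to choose, after looking at the data, which covariate to adjust for. The sample size, event rate, and number of candidate covariates are arbitrary, and a simple logistic model stands in for the survival analyses the investigators used.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, n_covariates, n_sims = 200, 15, 500
unadjusted_hits, adjusted_hits = 0, 0

def group_pvalue(y, X):
    """p-value for the treatment indicator (column 1, after the constant)."""
    return sm.Logit(y, X).fit(disp=0).pvalues[1]

for _ in range(n_sims):
    group = rng.integers(0, 2, n)              # randomized arm, no true effect
    y = rng.binomial(1, 0.25, n)               # events occur independently of arm
    covs = rng.normal(size=(n, n_covariates))  # noise covariates

    p_unadj = group_pvalue(y, sm.add_constant(np.column_stack([group])))

    # "Peeking": keep whichever single covariate makes the treatment effect look best
    p_best = p_unadj
    for j in range(n_covariates):
        Xj = sm.add_constant(np.column_stack([group, covs[:, j]]))
        p_best = min(p_best, group_pvalue(y, Xj))

    unadjusted_hits += p_unadj < 0.05
    adjusted_hits += p_best < 0.05

print(f"false positives, unadjusted analysis: {unadjusted_hits / n_sims:.1%}")
print(f"false positives, covariate chosen post hoc: {adjusted_hits / n_sims:.1%}")
```

The point of the sketch is that the second rate should come out above the nominal 5%: analytic flexibility, rather than the intervention, can manufacture apparently significant effects.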
We went on to critically examine the earlier study of psychosocial measures, adherence, and immune function.
Andersen BL, Farrar WB, Golden-Kreutz DM, et al: Psychological, behavioral, and immune changes after a psychological intervention: A clinical trial. Journal of Clinical Oncology 22:3570-3580, 2004
The abstract of this article reported testing the hypothesis that “psychological intervention can reduce emotional distress, improve health behaviors and dose-intensity, and enhance immune responses.” The results presented in the abstract were uniformly positive in terms of effects on anxiety and improvements in dietary habits, smoking, and adherence, with no negative results mentioned.
When we examined the actual methods section, we found that at least nine measures of mood, eight measures of health behavior, four measures of adherence, and at least 15 measures of immune function were assessed. There was no independent way of determining which of these measures represented the primary outcome for each domain. With so many outcomes examined, there was a high risk of obtaining apparent effects by chance.
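A back-of-the-envelope calculation shows how easily something comes up “significant” under these conditions. Treating the roughly 36 measures just listed as independent tests at the conventional .05 level is a simplification, since the outcomes are correlated, but it conveys the scale of the problem.

```python
# Probability of at least one false positive across many independent tests,
# assuming no true effects anywhere (a simplification; real outcomes are correlated).
alpha = 0.05
n_tests = 9 + 8 + 4 + 15  # mood + health behavior + adherence + immune measures

p_at_least_one = 1 - (1 - alpha) ** n_tests
print(f"{n_tests} tests -> P(at least one chance finding) = {p_at_least_one:.0%}")
# roughly 84%, even if the intervention does nothing at all
```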
Turning to the actual results, only one of the nine measures of mood was significant in time-by-treatment interactions. The intervention seemed to have a significant effect on dietary behavior (although it is unclear why the seemingly very different dietary behaviors were not analyzed separately) and on smoking, but no effect on exercise. As is often the case with early breast cancer patients, rates of adherence to chemotherapy were too high to allow any differences between the intervention and control groups to emerge. In terms of immune function, results were not significant for CD3, CD4, or CD8 cell counts, or for six assays of natural killer cell lysis. If we compare this overall pattern of results to what was stated in the abstract, we see a gross confirmatory bias in the suppression of negative results and the highlighting of positive ones.
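For readers unfamiliar with the analysis being referred to, the sketch below shows what a time-by-treatment interaction test looks like for a repeated-measures outcome, using a mixed model on simulated data. The variable names (mood, visit, arm) and the data are hypothetical; this illustrates the general approach, not the model the investigators fit.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_patients, n_visits = 100, 4
df = pd.DataFrame({
    "patient": np.repeat(np.arange(n_patients), n_visits),
    "visit": np.tile(np.arange(n_visits), n_patients),
    "arm": np.repeat(rng.integers(0, 2, n_patients), n_visits),
})
# Mood drifts over time, but the drift does not depend on arm (a null effect).
df["mood"] = 50 - 1.5 * df["visit"] + rng.normal(0, 5, len(df))

# The visit:arm coefficient is the time-by-treatment interaction:
# do the two arms follow different trajectories over the course of treatment?
model = smf.mixedlm("mood ~ visit * arm", df, groups=df["patient"]).fit()
print(model.summary())
print("interaction p-value:", model.pvalues["visit:arm"])
```

A trial claiming that an intervention improves mood should show this interaction reaching significance for its pre-specified primary mood measure, which is exactly what the 2004 paper, for eight of its nine mood measures, did not.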
Subsequent papers from this project amplified the confirmatory bias of these two papers by declaring a reduced risk of recurrence and death from breast cancer for intervention participants and gains for all secondary outcomes. These papers also cast doubt on whether the 2004 paper disclosed all of the outcome measures that were assessed. One article stated that for the subgroup of patients with elevated Center for Epidemiologic Studies-Depression (CES-D) scores, the intervention reduced depressive symptoms. This outcome is not even mentioned in earlier reports, but these subgroup analyses seem to imply that a reduction in depressive symptoms did not occur for the full sample. It is a reasonable inference that this null finding was suppressed in earlier reports. CES-D scores would seem to be the preferred primary measure of mood outcome for such studies. The CES-D has validated clinical cut points, and it is commonly believed that depression is the mood variable most strongly related to immune function. Another article referred to the Beck Depression Inventory (BDI), also an excellent candidate for a primary outcome in a study attempting to affect recurrence and survival via links between psychological variables and the immune system.
Our close reading of the results reported in these two articles suggests that the intervention is inert with respect to mood and immune function, and has no effect on progression and survival. The intervention is hardly ready for dissemination into the community. The designation of this intervention in advertisements for the free training as “the first evidence-based behavioral intervention designed to patients newly diagnosed with cancer” is premature and exaggerated. What could be meant by “evidence based”? Claims of “robust and enduring gains” in all categories of outcomes are simply wrong.
My colleagues and I gave our now familiar argument that there was a lack of evidence that any psychosocial intervention can reduce risk of recurrence or improve survival. There was also a lack of evidence for possible mechanisms by which such effects could conceivably be achieved.
Cancer published our commentary without a response from the authors because they continued to refuse to provide one. Our commentary was instead accompanied by a response from Peter Kaufmann, MD. We wondered why the choice came down to Dr. Kaufmann and why he would accept the offer to reply to us. He had not written much about cancer, but he is Deputy Chief of the Clinical Applications and Prevention Branch of the National Heart, Lung, and Blood Institute (NHLBI), and at the time his commentary was written, he was President of the Society of Behavioral Medicine.
The Cancer to Health (C2H) intervention package is based on the assumption that psychological variables have clinically significant effects on physical health via the immune system. Despite the lack of support for this idea with respect to cancer, the idea remains highly attractive and resistant to rejection because it lends prestige to psychosomatic and behavioral medicine. At the annual meeting of SBM at which Dr. Kaufmann became president, the keynote address was delivered by David Spiegel and basically involved debating, in absentia, skeptics and critics of the notion that psychosocial intervention could extend the survival of cancer patients, including me. I complained to Dr. Kaufmann that if Spiegel wanted to debate me, I should have been invited to respond. Kaufmann indicated that I would get an invitation for a keynote in the future to remedy this imbalance, but the occasion never materialized.
Subtitling his commentary “To Light a Candle,” Kaufmann conceded that my colleagues and I had raised valid criticisms about the design and interpretation of the C2H intervention trial. However, he took issue with our recommendation that clinical trials of this kind be suspended until putative mechanisms could be established by which psychological variables could influence survival. Quoting our statement that an adequately powered trial would require “huge investments of time, money, and professional and patient resources,” he nonetheless called for dropping a “preoccupation with mechanisms and secondary aims” and instead putting the resources toward increasing the sample size and quality of an intervention trial.
I remain puzzled by Kaufmann’s argument. In the absence of a specified mechanism by which psychological variables could have such an effect, was Kaufmann nonetheless suggesting that we needed a large trial to overcome the lack of power of the moderately sized C2H trial? I cannot imagine an NIH administrator making a similar argument for a large-scale study of an herbal product or coffee enemas or some other intervention with a similarly undocumented mechanism of influence.
Barbara Andersen, the principal investigator on both the C2H trial and the grant for training professionals in delivering the intervention, has never responded in print to our criticisms and charges that the trial does not affect progression or survival. However, she has complained to administrators at the institutions of a number of her critics and asked that they put a stop to behavior having negative ramifications for the field of behavioral research in cancer. She also campaigned, unsuccessfully, to have another critique that we published retracted.
It is unlikely that NIH showed favoritism in funding the training grant; it relied instead on scores obtained in peer review. Reviewers must have been swayed by the consistent confirmatory bias in the presentation of the results of C2H. However, NIH-supported forums show a bias toward claims about psychosocial interventions affecting physical health outcomes. Andersen and those making similar claims regularly get invited to annual NIH-sponsored symposia at professional meetings and reiterate the claims again and again. Apparently, there’s no room for critics on such panels.
The two papers presenting the outcomes of C2H have inaccurate abstracts and data-analytic strategies that hide that they are basically null trials. In this respect they are not alone. Elsewhere I have documented that other psychosocial trials [1,2] conducted by PNI investigators would be revealed to be null trials if time-by-treatment interactions were transparently reported for primary outcomes. Here is what to look for:
- Ignoring negative results in the main analyses of primary endpoints.
- Favoring secondary analyses, subgroup analyses, and endpoints developed post hoc over negative findings for the primary analyses.
- Putting a positive spin on the abstract, highlighting the best of the results obtained using these strategies.
- Presenting negative findings as if they were positive in subsequent publications.
PNI cancer researchers take a Texas sharpshooter approach to identifying positive effects for immunological variables. The apocryphal Texas sharpshooter drove drunk around Texas with his rifle and a can of red paint and shot up the sides of buildings. Afterwards, he would draw a bull’s-eye with some of his hits in the center, creating the impression of an expert marksman who always hit his mark. PNI researchers similarly collect numerous PNI measures, not on the basis of their known association with cancer, but based on their ease of assessment. Measures derived from saliva samples are particularly popular. Investigators then declare whatever measures prove significant as evidence that they have tapped into the PNI of cancer. Further, they claim to replicate existing studies, when existing studies obtained significant effects with different measures. Any positive result obtained with a battery of measures is declared a replication, even when it is not a precise replication.
Compared to cancer, behavioral interventions in HIV+/AIDS have the advantage of well-validated mechanisms by which behavioral interventions might conceivably influence the immune system and, in turn, readily measurable assessments of any clinically significant impact. This area has attracted considerable interest from PNI researchers, who, like cancer PNI researchers, praise their own and each other’s success in modifying clinically relevant immunological parameters. But a meta-analysis of 35 randomized controlled trials examining the efficacy of 46 separate stress management interventions for HIV+ adults (N = 3,077) tells a different story:
To our surprise, we did not find evidence that stress reduction interventions improve immune functioning or hormonal mechanisms that could influence immunity. These findings contrast with the PNI perspective that guided our work and most of the interventions included in our review (Antoni, 2003; Robinson et al, 2000). Thus, even though chronic stressors are known to suppress both cellular and humoral markers (see Segerstrom & Miller, 2004) the short-term use of stress-management strategies does not seem to reverse these processes in patients with HIV.
PNI cancer researchers remain a self-congratulatory group with a strong confirmatory bias in their mutual citations of the field’s claimed successes. Judging by citation patterns in the incestuous journal Brain, Behavior, and Immunity, one can readily get the impression that there are never any negative studies in the PNI cancer literature.
The articles reporting results for the C2H trial continue to be highly cited, with little apparent effect of our criticism. In the absence of other positive trials to cite, particular importance in the PNI literature has been attached to the claims that C2H extends the survival of cancer patients. There is apparently little concern about conveying unrealistic expectations to patients concerning the effects of psychosocial intervention on their immune system, and these claims fit with patients’ impressions and motivations for going to peer support groups and group therapy.
Cancer patients sometimes face difficult choices about medical interventions to manage their disease. It is unfortunate if they are provided with the misinformation that all they need to do is get stress management interventions to slow progression and extend their survival. Belief that these interventions are effective can discourage them from committing to more effective, but painful, fatiguing, and disfiguring medical interventions.