I’ve frequently noted that one of the things most detested by quacks and promoters of pseudoscience is peer review. Creationists hate peer review. HIV/AIDS denialists hate it. Anti-vaccine cranks like those at Age of Autism hate it. Indeed, as a friend of mine, Mark Hoofnagle, pointed out several years ago, pseudoscientists and cranks of all stripes hate it. There’s a reason for that, of course, namely that it’s hard to pass peer review if you’re peddling pseudoscience, although, unfortunately, with the rise of “integrative medicine,” it’s nowhere near as difficult as it once was.
Be that as it may, peer review, the process by which scientific papers are evaluated by scientific “peers” to look for problems with the science and decide if the paper is appropriate for publication in a scientific journal, is a concept that dates back hundreds of years. However, for the most part, before the middle of the 20th century, the ultimate determination of whether a paper was appropriate for scientific publication was made by editors or editorial committees. Opinions of external reviewers were sometimes sought when deemed appropriate by journal editors, but by no means was this the practice for most manuscripts. Over the last six or seven decades, external peer review by scientists chosen by the journal editor evaluating a submission has become the standard. Similarly, decisions regarding whether or not to fund grant applications are now generally made by a panel of external reviewers. In the case of the NIH, these panels are called study sections and consist of scientists with expertise in the types of applications being referred to the study section for evaluation, along with (usually) a statistician or two and officials from the NIH who take care of organizing and running the meetings of the panel. The scientific members of a study section usually include “permanent” members, who are assigned to fixed terms on the study section, and ad hoc members, called in for one or a few meetings as needed and deemed necessary by the NIH.
I’ve not infrequently stolen the words from one of Winston Churchill’s speeches to describe our current peer review system:
Many forms of Government have been tried and will be tried in this world of sin and woe. No one pretends that democracy is perfect or all-wise. Indeed, it has been said that democracy is the worst form of government except all those other forms that have been tried from time to time.
Simply substitute the words “scientific evaluation” for “Government” and “peer review” for democracy, and you get my drift. Peer review is, like any system devised by human beings, imperfect. Scientists know that it is not perfect or all-wise. Indeed, scientists probably complain about peer review more than anyone else, because we have to deal with it many times a year, as applicants, authors, or peer reviewers. Of course, to the pseudoscientists and quacks we routinely discuss here, peer review is viewed as the equivalent of the Cerberus guarding the gates of Hades (the Underworld) preventing the spirits of the dead from escaping, except in this case, “escape” means breaking out of the crank journals and bottom-feeding “pay-to-publish” open access journals and getting their work published in a real scientific journal. OK, OK, I know it’s not a perfect metaphor, but peer review isn’t a perfect process; so I’ll use it anyway. Besides, if you’re a scientist trying to get a paper published and have had to deal with clueless peer reviewers, the image of the peer review process as a giant three-headed dog has undeniable appeal, given that most scientific papers are assigned to three reviewers.
I’ve been thinking about writing a post about peer review at least since August, but, since then, something always seemed to manage to catch my interest when Sunday blogging time rolled around. (Truly, I am Dug the Dog when it comes to blogging.) I figured the topic would keep for another week. Then, last week’s New England Journal of Medicine featured a Perspective article by Charlotte Haug entitled “Peer-Review Fraud — Hacking the Scientific Publication Process,” complete with an accompanying interview with her. I’m sure I’ll be seeing this article featured on quack websites very soon. That is what we call in the biz an “in.” So I dusted off the list of web pages I had been carefully hoarding at least since summer. Let’s dig in.
Hacking Cerberus
Quacks love it when scientists complain about peer review because they think that those complaints validate their conspiratorial belief system about “close-minded” scientists trying to “suppress” their views. Of course, our pointing out the shortcomings of the peer review system is generally intended as a starting point from which either to improve the existing system or to discuss potential alternative systems to replace it, not as agreement that pseudoscience should be published in scientific journals. We know that science depends on transparency and honesty; if those are compromised, the trustworthiness of science itself can be compromised. In any event, beginning a few months ago, advocates of various pseudoscientific forms of medicine started circulating articles quoting such complaints, citing them as evidence that science is irretrievably corrupt, broken, biased, or close-minded (take your pick of any or all), the implication being, of course, that their preferred form of quackery has legitimacy but is being unfairly excluded from the scientific literature by the peer review process.
In her article, Haug notes a disturbing trend in peer review: peer review evaluations that are outright fraudulent. She points out that in August, Springer retracted 64 articles from ten different journals “after editorial checks spotted fake email addresses, and subsequent internal investigations uncovered fabricated peer review reports.” Only months earlier, BioMed Central, also owned by Springer, had retracted 43 articles for exactly the same reason. Haug notes:
“This is officially becoming a trend,” Alison McCook wrote on the blog Retraction Watch, referring to the increasing number of retractions due to fabricated peer reviews.2 Since it was first reported 3 years ago, when South Korean researcher Hyung-in Moon admitted to having invented e-mail addresses so that he could provide “peer reviews” of his own manuscripts, more than 250 articles have been retracted because of fake reviews — about 15% of the total number of retractions.
How is it possible to fake peer review? Moon, who studies medicinal plants, had set up a simple procedure. He gave journals recommendations for peer reviewers for his manuscripts, providing them with names and e-mail addresses. But these addresses were ones he created, so the requests to review went directly to him or his colleagues. Not surprisingly, the editor would be sent favorable reviews — sometimes within hours after the reviewing requests had been sent out. The fallout from Moon’s confession: 28 articles in various journals published by Informa were retracted, and one editor resigned.3
When I first found out about “fake” peer review, I had a hard time believing it. The main reason I was so incredulous was that I couldn’t believe that journal editors would be so clueless as to let something like this happen. After all, most peer reviewers work at either university or government facilities; if you see, for example, a manuscript submission with suggested peer reviewers with Gmail, Hotmail, or Yahoo! accounts (or any account not using the domain name of the university or institution where that peer reviewer works), you’d think that would at least raise a red flag to look a bit more closely. Yes, I know that some scientists might use their home e-mail addresses, but at the very least a non-university or non-institutional e-mail address should lead the editor to take a closer look. Most scientists’ e-mail addresses are locatable through their university’s website or by looking up the most recent papers they’ve published as corresponding author; in the case of industry it’s more difficult but not insurmountable.
Unfortunately, given how relatively easy it is (or should be) to detect the kind of fake peer reviewers mentioned above, some researchers have become more sophisticated in their peer review fraud:
Peter Chen, who was an engineer at Taiwan’s National Pingtung University of Education at the time, developed a more sophisticated scheme: he constructed a “peer review and citation ring” in which he used 130 bogus e-mail addresses and fabricated identities to generate fake reviews. An editor at one of the journals published by Sage Publications became suspicious, sparking a lengthy and comprehensive investigation, which resulted in the retraction of 60 articles in July 2014.
It goes beyond even a researcher creating his own “peer review and citation ring.” There exist companies that offer manuscript preparation services to authors. Many are reputable and exist to help with editing and figure preparation. Some provide ghost writing services. Others, as this Committee on Publication Ethics (COPE) statement reports, offer services that include fabricated contact details for peer reviewers to be used during the submission process plus reviews from these fabricated addresses. COPE notes that some of these “peer reviewers” have “the names of seemingly real researchers but with email addresses that differ from those from their institutions or associated with their previous publications” and that “others appear to be completely fictitious.” COPE notes that it’s not clear how much the authors of manuscripts submitted using such services know, specifically whether they know that the reviewer names and e-mail addresses are fraudulent. My response to this is “Oh, really?”
It goes beyond even this, though, as a more detailed report of Hyung-In Moon’s and Peter Chen’s fraud in Nature documented. Moon and Chen both exploited a flaw at the heart of Thomson Reuters’ ScholarOne, a publication-management system used by quite a few publishers. Again, it’s a flaw so unbelievably obvious that, in this era of concern about identity theft and cyber-crime, it’s incredible that this is how ScholarOne works:
Moon and Chen both exploited a feature of ScholarOne’s automated processes. When a reviewer is invited to read a paper, he or she is sent an e-mail with login information. If that communication goes to a fake e-mail account, the recipient can sign into the system under whatever name was initially submitted, with no additional identity verification. Jasper Simons, vice-president of product and market strategy for Thomson Reuters in Charlottesville, Virginia, says that ScholarOne is a respected peer-review system and that it is the responsibility of journals and their editorial teams to invite properly qualified reviewers for their papers.
So, if an editor agrees to use one of the author’s fake suggestions, that author is allowed into the ScholarOne system and can create whatever identity he wants as a registered “peer reviewer” in the system. Unfortunately, ScholarOne isn’t the only system with such glaring vulnerabilities. Another system, Editorial Manager, does something no halfway well-designed system in 2015 should be doing:
Editorial Manager’s main issue is the way it manages passwords. When users forget their password, the system sends it to them by e-mail, in plain text. For PLOS ONE, it actually sends out a password, without prompting, whenever it asks a user to sign in, for example to review a new manuscript. Most modern web services, such as Google, hide passwords under layers of encryption to prevent them from being intercepted. That is why they require users to reset a password if they forget it, often coupled with checking identity in other ways.
Yes, I’ve experienced this very thing as a reviewer for journals using Editorial Manager. Even so, to me the Nature article is misguided in that it seems to harp on vulnerabilities in the various computer software platforms used by publishers to manage submissions and peer review a bit too much and on the true flaw that allows self-peer review to occur a bit too little. Don’t get me wrong. Technological and security problems are serious. After all, no software should make it so easy for fake reviewers to be entered into the system, and no software should be sending out passwords in regular e-mail in plain text. However, the true problem that facilitates fraud of this sort lies less within the software used than within the system that uses the software. Even so, Nature’s list of “red flags” that “you just might be dealing with fake peer reviewers if…” is rather simple. One is even amusing:
- The author asks to exclude some reviewers, then provides a list of almost every scientist in the field.
- The author recommends reviewers who are strangely difficult to find online.
- The author provides Gmail, Yahoo or other free e-mail addresses to contact suggested reviewers, rather than e-mail addresses from an academic institution.
- Within hours of being requested, the reviews come back. They are glowing.
- Even reviewer number three likes the paper.
Number four amuses me just based on my own behavior. I rarely complete peer reviews in less than three days, and frequently I’m so busy that I’m late, such that the editorial software is sending me reminders. As for number five, this gives you an idea of why that’s downright funny:
Yes, “reviewer number three” is notorious for being the one whose criticisms of a submitted manuscript are the most—shall we say?—pointed.
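For the programmers in the audience, the first few red flags on that list are simple enough to check mechanically. Here is a minimal Python sketch of what such a check might look like. It is not how any real editorial system is known to work: the `SuggestedReview` record, the short `FREE_EMAIL_DOMAINS` list, the 24-hour cutoff standing in for “within hours,” and the example names and addresses are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical, deliberately short list; a real one would be far longer.
FREE_EMAIL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com"}


@dataclass
class SuggestedReview:
    """One review returned by an author-suggested reviewer (hypothetical record)."""
    reviewer_name: str
    reviewer_email: str
    requested_at: datetime
    returned_at: datetime
    recommends_acceptance: bool


def red_flags(review, fast=timedelta(hours=24)):
    """Return human-readable warnings for a single review.

    The 24-hour default for `fast` is an assumption, not a published rule.
    """
    flags = []
    domain = review.reviewer_email.rsplit("@", 1)[-1].lower()
    if domain in FREE_EMAIL_DOMAINS:
        flags.append("free e-mail address")
    if review.returned_at - review.requested_at < fast:
        flags.append("review returned within hours")
    # A positive review only counts against a reviewer who is already suspect.
    if flags and review.recommends_acceptance:
        flags.append("glowing review from a flagged reviewer")
    return flags


# Made-up examples: one suspicious review, one ordinary one.
t0 = datetime(2015, 8, 1, 9, 0)
suspicious = SuggestedReview("A. Nonymous", "a.nonymous@gmail.com",
                             t0, t0 + timedelta(hours=3), True)
ordinary = SuggestedReview("B. Busy", "b.busy@university.example",
                           t0, t0 + timedelta(days=5), True)
print(red_flags(suspicious))
print(red_flags(ordinary))  # prints []
```

A check like this can only prompt an editor to take a closer look. As noted above, some perfectly legitimate scientists use their home e-mail addresses, so a flag is a reason to verify, not proof of fraud.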
The fox guarding the henhouse?
It should be quite clear from the discussion above that the real practice that facilitates peer review fraud is the way that many journals ask authors for names of suggested peer reviewers and then actually use those names. I’ve always wondered about this myself, because, after all, at the very minimum, no one’s going to suggest a peer reviewer who’s likely to trash the paper being submitted. Even leaving aside the possibility of fake peer reviewers, using peer reviewers suggested by an author makes it far more likely that the review will be less rigorous and far more likely to recommend publication with few changes. After all, scientists are only human. If they’re asked to pick their own peer reviewers, of course they’re going to pick ones that maximize their chances of getting published and minimize their chances of having to do multiple revisions and more experiments to satisfy reviewers’ criticisms.
Readers who aren’t scientists and haven’t dealt with peer review before might reasonably wonder: Why on earth do editors do this? Haug lists three reasons:
- In highly specialized fields, authors may actually be the best qualified to suggest suitable reviewers for the manuscript in question.
- It makes life easier for editors because finding peer reviewers can be difficult, given that it’s unpaid work that can be quite demanding.
- Journals and publishers are becoming increasingly multinational, which means that it’s become more difficult for editors and members of editorial boards to be familiar with all the scientists throughout the world working on a topic.
These all sound very reasonable, but for them to be valid reasons to use author-recommended reviewers there have to be trust, honesty, and transparency because, as Steve Novella pointed out discussing this issue, scientists are human beings and some proportion of human beings will always cheat to gain an advantage. That can never be completely eliminated. However, any good system with an incentive for cheating (and, make no mistake, there are major incentives for scientists to publish in good journals, as such publications can make their careers and provide evidence of productivity to be used in grant applications) should implement processes to make cheating more difficult and the price of being discovered cheating more costly.
It seems to me that, at the very minimum, the era of asking scientists for suggestions for peer reviewers for their own manuscripts must end. The reasons why many (but by no means all) journals have done so for so many years are quite understandable but no longer defensible in the wake of these damaging and large-scale incidents of self-peer review fraud. This practice must stop, even at the price of more work for already harried editors. One technological solution that might help would be a database of peer reviewers, each with his or her relevant field of expertise listed, as well as collaborators and those with whom they’ve published, so that editors can know not to send a manuscript to an author’s friend or collaborator for review. In the wake of these scandals, it might even be profitable for a company to develop such a database and sell access to publishers. Lacking a system like this, it will fall on the shoulders of editors to be more careful and to pick peer reviewers themselves, rather than using any recommendations by authors submitting manuscripts.
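To make the database idea a bit more concrete, here is a rough Python sketch of the core lookup such a system would need: given a manuscript’s authors and topic, return candidate reviewers with the right expertise who have never shared a byline with any of the authors. Everything in it is hypothetical, including the function names, the data layout, and the people; a real system would also have to handle name disambiguation, shared institutions, and personal relationships that co-authorship records can’t capture.

```python
def coauthors_of(name, papers):
    """Return everyone who shares at least one byline with `name`.

    `papers` is a list of author sets, one set per published paper.
    """
    found = set()
    for authors in papers:
        if name in authors:
            found.update(authors)
    found.discard(name)
    return found


def eligible_reviewers(manuscript_authors, topic, candidates, papers):
    """Candidates with matching expertise and no direct co-authorship conflict.

    `candidates` maps a reviewer's name to a set of fields of expertise.
    Only direct co-authors are excluded; co-authors of co-authors are not.
    """
    conflicted = set(manuscript_authors)
    for author in manuscript_authors:
        conflicted |= coauthors_of(author, papers)
    return sorted(
        name
        for name, fields in candidates.items()
        if topic in fields and name not in conflicted
    )


# Made-up publication records and reviewer pool.
papers = [
    {"Dr. Author", "Dr. Friend"},
    {"Dr. Friend", "Dr. Colleague"},
    {"Dr. Stranger", "Dr. Other"},
]
candidates = {
    "Dr. Friend": {"medicinal plants"},
    "Dr. Colleague": {"medicinal plants"},
    "Dr. Stranger": {"medicinal plants"},
    "Dr. Other": {"particle physics"},
}
print(eligible_reviewers(["Dr. Author"], "medicinal plants", candidates, papers))
# prints ['Dr. Colleague', 'Dr. Stranger']
```

Note the design choice in the example: Dr. Colleague is allowed through because only direct co-authors are treated as conflicted. Whether a friend of a friend is too close for comfort is exactly the sort of judgment call that would still fall to the editor.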
Is the peer review system a “sacred cow” that needs slaughtering?
All this brings me back to the title of this post, which is based on a quote from Richard Smith, former editor of the British Medical Journal (now The BMJ). Speaking at a Royal Society meeting in April, Smith characterized the peer review system as a “sacred cow” ready to be “slaughtered.” As you can imagine, that particularly juicy quote went down quite well among those who are less than enamored with science-based medicine, such as Robert Scott Bell and, of course, Mike Adams’ minion Ethan Huff at NaturalNews.com, who twisted Smith’s quote to read ‘Sacred cow’ of industry science cult should be slaughtered for the good of humanity, BMJ editor says.
Of course, what these accounts neglected to mention was that Smith made his quotes in the context of a debate with Georgina Mace, professor of biodiversity and ecosystems at University College London, with Smith taking the “anti-” position and Mace taking the “pro-.” Thus, it might not be surprising for each debater to take a more extreme position. For instance, Smith actually characterized John Ioannidis’ famous 2005 paper “Why most published research findings are false” as meaning that “most of what is published in journals is just plain wrong or nonsense,” which is clearly not what Ioannidis was saying. Just because something turns out to be incorrect does not make it nonsense in the context of the time, and, in fact, Ioannidis was making an argument that prior plausibility has to be taken into account in doing and interpreting research studies, which is a key argument for science-based medicine.
Still, Smith did make some good points, particularly when he described a BMJ experiment in which a brief paper was sent to 300 reviewers with eight deliberate errors introduced into it. No reviewer found more than five; the median was two, and 20% didn’t spot any. Of course, I would counter that this observation is not an indictment of peer review as a process, but rather evidence that BMJ under Smith’s editorship didn’t pick its peer reviewers very well and that peer review needs improvement. Perhaps, instead of scrapping peer review, we should work to improve it.
Fix it, don’t dump it
Fixing peer review is more the approach taken by Richard Horton, who is the current editor-in-chief at The Lancet and who published an article around the same time entitled Offline: What is medicine’s 5 sigma? Of course, whenever I hear Horton pontificate about peer review, it’s hard for me not to remember that he was also the editor of The Lancet under whose regime Andrew Wakefield published his execrable 1998 Lancet case series that has been used to blame autism on the MMR vaccine for nearly 18 years. Still, after discussing the problems with research and peer review, Horton does make some decent points. Perhaps he has learned from l’affaire Wakefield:
Can bad scientific practices be fixed? Part of the problem is that no-one is incentivised to be right. Instead, scientists are incentivised to be productive and innovative. Would a Hippocratic Oath for science help? Certainly don’t add more layers of research red-tape. Instead of changing incentives, perhaps one could remove incentives altogether. Or insist on replicability statements in grant applications and research papers. Or emphasise collaboration, not competition. Or insist on preregistration of protocols. Or reward better pre and post publication peer review. Or improve research training and mentorship. Or implement the recommendations from our Series on increasing research value, published last year. One of the most convincing proposals came from outside the biomedical community. Tony Weidberg is a Professor of Particle Physics at Oxford. Following several high-profile errors, the particle physics community now invests great effort into intensive checking and re-checking of data prior to publication. By filtering results through independent working groups, physicists are encouraged to criticise. Good criticism is rewarded. The goal is a reliable result, and the incentives for scientists are aligned around this goal. Weidberg worried we set the bar for results in biomedicine far too low. In particle physics, significance is set at 5 sigma—a p value of 3 × 10⁻⁷ or 1 in 3·5 million (if the result is not true, this is the probability that the data would have been as extreme as they are).
I always love it when physicists suggest such a strategy, given how much more variability is inherent in biological and medical research, so much so that very few experiments ever reach that level of statistical significance. Still, I could see decreasing the p-value threshold for “statistical significance” to 0.01 or even 0.001. I could even see eliminating the p-value altogether and instead using Bayesian reasoning to estimate the probability that a given result is correct.
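For anyone who wants to check the arithmetic, here is a short Python sketch. The first function reproduces the 5 sigma figure as the one-sided tail of a normal distribution. The second is a textbook Bayes’ rule calculation of how likely a “statistically significant” result is to be true at different thresholds. The 10% prior and 80% power plugged into it are purely illustrative assumptions, not numbers taken from Horton, Weidberg, or Ioannidis.

```python
import math


def one_sided_p(sigma):
    """Upper-tail probability of a standard normal beyond `sigma` standard deviations."""
    return 0.5 * math.erfc(sigma / math.sqrt(2))


def probability_true(prior, alpha, power):
    """Bayes' rule: chance a significant result reflects a real effect.

    prior: fraction of tested hypotheses that are actually true (assumed)
    alpha: significance threshold, i.e. the false-positive rate
    power: chance of detecting a real effect when there is one (assumed)
    """
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)


p5 = one_sided_p(5)
print(f"5 sigma: p = {p5:.2e}, or 1 in {1 / p5 / 1e6:.1f} million")
# prints: 5 sigma: p = 2.87e-07, or 1 in 3.5 million

# Illustrative assumptions: 10% of tested hypotheses are true, 80% power.
for alpha in (0.05, 0.01, 0.001):
    print(alpha, round(probability_true(prior=0.1, alpha=alpha, power=0.8), 2))
```

Under those assumed numbers, tightening the threshold from 0.05 to 0.001 raises the chance that a significant finding is real from roughly 64% to roughly 99%. With a less plausible hypothesis (a smaller prior), the same thresholds do far worse, which is the whole point of taking prior plausibility into account.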
There is value in some of Horton’s other suggestions. Certainly, one problem is that, as much as we scientists want to do a good job at peer review, the fact remains that peer review is unpaid and, from an academic standpoint, doesn’t really contribute much to our career advancement. For instance, when going up for promotion, assistant professors do have to show evidence of scholarly activity, such as peer review, but peer review is of low value in that equation compared to other activities. Publishing a single peer-reviewed paper in a decent journal is worth more than reviewing dozens of papers for journals, and a single NIH grant is worth more than reviewing any conceivable number of papers. Receiving neither significant financial nor career rewards for performing the onerous duty of peer review, scientists understandably don’t knock themselves out to review papers. Is this any wonder, particularly given that, as Horton points out, there is no reward for high quality reviews and a perverse incentive (i.e., fewer papers to review) for doing low quality reviews? These are the sorts of impediments that have to be changed, along with tightening procedures to make self-peer review far more difficult to achieve. More radical changes could include a system of “open” peer review, such that reviewers are known and their comments follow the published paper, although such a system would present its own challenges, particularly given the reluctance of more junior faculty to publicly criticize the work of more renowned senior faculty.
What is becoming clear is that, whatever changes we make in the peer review system, we can’t keep doing what we’re doing any more. Referencing the Churchill quote, at the moment, as flawed as it is, our peer review system is the best system we have for evaluating science. Until someone can come up with an alternative that works at least as well (admittedly not the highest bar in the world), it would be premature to abandon it. That doesn’t mean it can’t be improved. Contrary to Richard Smith’s view, peer review is not a sacred cow, and it doesn’t yet need to be slaughtered.