This article, unfortunately, is not about the science of zombies, but rather science that continues on after it has effectively died. What makes a scientific study dead (and undead)?
The most obvious and definitive way in which a scientific study can be officially pronounced dead is if it is retracted. Scientific research is born into life (to continue this metaphor) when it is published in the peer-reviewed literature. Being retracted from the journal in which it was published is scientific death (or at least it should be).
Retraction Watch, which is an excellent resource, cited over 650 retractions in 2016. There has been an overall trend of increasing retractions, up about 10-fold from 2000. The reasons for retraction vary: about 11% are for fabrication or falsification of data, 17% for self-plagiarism, 16% for plagiarism, 28% for honest errors, and 11% for irreproducible results. The majority of retractions come from high-impact journals and from the biomedical sciences.
Retractions themselves are not necessarily a problem. They represent the peer-reviewed literature policing itself and correcting errors when found. The real question is: what happens to a paper after it is retracted? One way to quantitatively address that question is to track citations. A citation is when one scientific paper references another. This usually means that the citing paper is building on the previous work in some way. It may be literally reviewing previous studies, or just using the cited work to support its conclusions or premises.
As an aside, one bias in the literature that we have not discussed enough on SBM is citation bias. This is a clearly demonstrated bias for researchers to cite positive studies over negative studies, regardless of study quality. Essentially this is the scientific literature equivalent of confirmation bias.
Citations are also the way journals measure their impact factor. The more their articles are cited by other articles, the higher their impact factor (which relates to how profitable they are). Therefore, there is a clear incentive to publish positive studies over negative studies, and in general studies that are more likely to be cited (because they are new and interesting). These are also the studies most likely to be retracted, which may explain why high-impact journals have a relatively high retraction rate.
So what happens to citations when an article is retracted? The online version of that article is then watermarked with a big “Retracted” label or a notice at the top indicating the article is now retracted. Print versions that are out there in the world are, of course, unaltered. Online curators of peer-reviewed articles, such as PubMed and MEDLINE, are also notified. Is this enough? Probably not.
DARPA has been looking into this question. They found that one paper published in 2005 and retracted in 2010 was cited 667 times, about half of those citations after it was retracted. That is a zombie paper, living on in an undead state after its official death by retraction.
The ripple effect of citations is also huge. Not only can a paper be cited many times, but the papers that cite it are also cited. Even if we just consider these second-degree citations, a typical paper can have thousands of such connections, and a high-impact paper, tens of thousands. That is the nature of modern science: researchers do not usually work in isolation. The literature is a giant web of communication and collaboration.
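To make "second-degree citations" concrete, here is a minimal Python sketch. It assumes the citation data is available as simple (citing paper, cited paper) pairs; the paper labels and the function names `build_cited_by` and `second_degree_citers` are hypothetical, made up for illustration, and do not come from any real citation database.

```python
from collections import defaultdict


def build_cited_by(citations):
    """Map each paper to the set of papers that cite it.

    `citations` is an iterable of (citing_id, cited_id) pairs.
    """
    cited_by = defaultdict(set)
    for citing, cited in citations:
        cited_by[cited].add(citing)
    return cited_by


def second_degree_citers(cited_by, paper):
    """Return (first, second): direct citers, and papers that cite those citers."""
    first = set(cited_by.get(paper, ()))
    second = set()
    for citer in first:
        second |= cited_by.get(citer, set())
    # Count each paper once: drop direct citers and the paper itself.
    second -= first
    second.discard(paper)
    return first, second


# Hypothetical toy data: B and C cite A; D and E cite B; E also cites C.
citations = [("B", "A"), ("C", "A"), ("D", "B"), ("E", "B"), ("E", "C")]
cited_by = build_cited_by(citations)
first, second = second_degree_citers(cited_by, "A")
print(sorted(first), sorted(second))
```

Even in this toy example the second ring is as large as the first. With real citation counts in the dozens or hundreds per paper, the second ring grows very quickly.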
There are deeper questions that need to be researched. What is the effect of citations to retracted papers? Do they invalidate the results or conclusions of the paper making the citation? This is most likely for systematic reviews and meta-analyses: if one of the papers you relied upon in your review is revealed as bogus, that might dramatically change the results of the review. Other citations might be negligible, merely one of many citations simply to establish some basic premise, and unaltered by a single retraction.
We can think about what should happen in a perfect world. I think this is yet another situation in which we are not optimally leveraging the power of digital information. When a paper is retracted, it should no longer be possible to cite it. If a new paper is submitted that cites a retracted paper, that should be caught in the editorial review. Researchers should be able to easily determine if any paper they are trying to cite is a zombie paper.
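The editorial check described here could be very simple in principle. The following Python sketch assumes the journal has a list of identifiers for retracted papers and the manuscript's reference list in the same form. The identifiers shown are made-up placeholders, and `find_retracted_references` and `normalize` are hypothetical helper names, not part of any existing editorial system.

```python
def normalize(identifier):
    """Trim whitespace and lowercase, so the same ID typed differently still matches."""
    return identifier.strip().lower()


def find_retracted_references(reference_ids, retracted_ids):
    """Return the manuscript references that appear in the retracted list, in order."""
    retracted = {normalize(r) for r in retracted_ids}
    return [ref for ref in reference_ids if normalize(ref) in retracted]


# Hypothetical placeholder identifiers, for illustration only.
retracted_ids = ["10.0000/example.001", "10.0000/example.007"]
manuscript_refs = [
    "10.0000/example.003",
    " 10.0000/EXAMPLE.007 ",
    "10.0000/example.010",
]

flagged = find_retracted_references(manuscript_refs, retracted_ids)
for ref in flagged:
    print("Cites a retracted paper:", ref.strip())
```

The hard part in practice is not this lookup but keeping a complete, current list of retractions and getting every journal to run the check.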
Further, once a paper is retracted, the lead author of every paper that cited it should get a message and should be essentially forced to revise their paper to account for the retraction. This may be as simple as removing the citation, but it may also require changing the text if it refers to or depends on the now defunct citation. It may even require reanalyzing data to remove the retracted data from the analysis. The worst-case scenario is that a paper which depended upon the now-retracted citation may itself need to be retracted, with the process repeating for any paper that cited it.
In other words, the retraction should have the same ripple effect throughout the web of scientific literature that the original paper had, extricating its tendrils from the web and making all needed corrections. Yes, this is a lot of work, but that is precisely why we need to leverage the power of computers to make it happen. This also sounds like a job for AI software, and is likely a lot less complicated than Google’s search algorithms. Essentially, we have the technology to do this if we want to.
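As a rough illustration of how such a ripple could be traced, here is a minimal Python sketch of a breadth-first walk outward from a retracted paper. It assumes a lookup table from each paper to the papers that cite it; the toy data and the name `retraction_ripple` are hypothetical. Note that it only finds which papers are downstream and how many steps away they are. Deciding whether a given paper actually needs correcting would still take human judgment.

```python
from collections import deque


def retraction_ripple(cited_by, retracted):
    """Return {paper: steps from the retracted paper} for every downstream paper.

    `cited_by` maps a paper ID to the collection of papers that cite it.
    """
    depth = {retracted: 0}
    queue = deque([retracted])
    while queue:
        current = queue.popleft()
        for citer in cited_by.get(current, ()):
            if citer not in depth:  # visit each paper once, at its shortest distance
                depth[citer] = depth[current] + 1
                queue.append(citer)
    del depth[retracted]  # report only the papers affected, not the source
    return depth


# Hypothetical toy web: B and C cite A; D cites B and C; E cites C; F cites D.
toy_cited_by = {
    "A": {"B", "C"},
    "B": {"D"},
    "C": {"D", "E"},
    "D": {"F"},
}

ripple = retraction_ripple(toy_cited_by, "A")
for paper, steps in sorted(ripple.items()):
    print(paper, "is", steps, "step(s) from the retracted paper")
```

Papers one step away are the direct citers whose authors would get the first notification. Those further out only matter if a paper between them and the retraction turns out to need correction itself.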
The bigger picture is that the scientific literature has not completely made the transition from traditional print journals to an online digital existence. That transition needs to be completed, with journals, search engines, and libraries rebuilt from the ground up digitally. How and what gets “published” should be optimized for scientific progress using digital technology, and not tied to the old restrictions. Editors should no longer be able to say things like, “We don’t have the space to publish boring exact replications.” The idea of “space” when it comes to publishing is a paper restriction, not a digital restriction.
Further, the response to retractions should be immediate and absolute. Those zombie papers need to be staked through the heart (OK, that’s vampires, but you get the idea) and permanently laid to rest. It should simply not be possible for a paper to receive a single citation after it has been retracted.
Retraction, of course, is not the only problem with published studies or the only way in which a study can become a “zombie.” Some studies simply come to conclusions that we now know are wrong, even though the study itself was legitimate. That should also be noted on the published paper.
DARPA is looking into ways to label published scientific studies with a rating system that indicates how reliable the results are. Impact factor is not enough, and in fact may be misleading as it is biased towards positive studies that are, if anything, more likely to be retracted. I have suggested previously that preliminary studies should be clearly labeled with a warning that their results are preliminary, exploratory, or speculative only, and should not be cited as reliable evidence. “For use by other researchers only!”
I seriously think a company like Google should be brought in and given a grant to develop a system for monitoring, tracking, and grading studies in a more thorough way that is optimized for scientific utility, progress, and reporting in the media. This is not meant to replace the collective judgement of the scientific community, but to codify it and make it more transparent.
It’s time to build a scientific literature infrastructure for the 21st century.