As a surgeon-scientist who has over the years applied for more research grants than I can currently remember from various funding agencies, including the NIH and other federal entities, I sometimes like to discuss how the NIH determines which grants to fund, given that the current budget only allows the NIH to fund roughly 10% of the grant applications that it receives every year. (This is, unfortunately, a situation that has persisted for many years now, despite increases in the NIH budget.) After all, I have submitted far more grant applications to fund my lab to the NIH, the Department of Defense (which, little known to most people, actually funds a fair amount of cancer research), and other agencies than have ever been funded. (I will keep my percent success rate a secret, but it’s fairly low, which is not uncommon among investigators.) At the same time, I realize that a lot of these discussions might seem to be a bit too “inside baseball” for many readers, but I consider them important nonetheless. This is because, since the start of the COVID-19 pandemic, many conspiracy theories have cropped up that portray the NIH granting process as based on, in essence, Anthony Fauci (or the NIH director) personally deciding whose grant applications receive funding and only doling out funds to researchers who support the “government line,” like a mob boss doling out favors to underlings in return for favors done him. Basically, such conspiracy theorists cannot imagine any government institution providing research money to any group opposed to its messaging, or funding any research whose results might not line up with the message “the government” wants to promote.

I rather suspect that the origin of this misconception about how NIH grants are funded, which has always existed but only reached the consciousness of the public outside of antivax conspiracy bubbles since the pandemic hit, comes from projection. Basically, doling out grant funds to reward supporters and withholding them to punish “dissidents,” as antivax conspiracy theorists portray Anthony Fauci and the NIH as doing, is exactly how conspiracy theorists themselves would determine whose research is funded and whose is not. They therefore assume that this is how the NIH has always done it. With the pandemic, and Anthony Fauci becoming a major face representing the government response to COVID-19, naturally they personalized this, in particular in light of various “lab leak” conspiracy theories that claim that Fauci funded research in Wuhan that led to the creation of SARS-CoV-2, the coronavirus that causes COVID-19. I’m not going into the weeds of whether the research carried out at Wuhan was truly “gain of function” research (although it does not appear to have been); rather I’m going to focus on how antivax and COVID-19-minimizing sources portrayed grants to Wuhan as having been granted by Fauci personally, such as a story in WorldNetDaily last year titled “New evidence ties COVID-19 creation to research funded by Fauci?” Let’s just say that this histrionic headline was…not accurate.

Even though I have argued on occasion that, whatever its flaws, the NIH grant selection process is about as close to a true meritocracy as any government agency gets, that doesn’t mean that there isn’t a lot of room for improvement. “As close to a true meritocracy” does not in this case mean anywhere near close enough. One common criticism of NIH granting processes dating back decades is that the study sections that evaluate and score grant applications based on scientific merit are too “conservative” and tend to favor “safe” research that might produce incremental results but nothing ground-breaking. Unsurprisingly, this is a problem that is perceived, at least, to be much worse in times of tight funding, when the top 10% or less of grant applications are funded. After all, when funds are tight, study sections feel less comfortable supporting research that is more likely to go nowhere and thereby “waste” NIH funds.

Indeed, years before the pandemic, I took issue with John Ioannidis, who made this very argument, railing against how the NIH supposedly favored “conformity” and “mediocrity”. Let’s just say that, although I see some merit in these arguments, I also strongly believe that the critics who make them have a rather exaggerated view of the supposed “brave mavericks” who are missing out on NIH funding supposedly because they are too brilliant, creative, and “outside the box”. I have also pointed out that the anecdotes used to make this argument are often missing important context that makes NIH decisions not to fund the work more understandable. (In one case, this involved the knowledge that other scientists were having trouble replicating the experimental results used to support the grant application.) Then there is the issue of how study sections would choose which high-risk “outside the box” studies to fund, given that such studies almost by definition don’t have a lot of preliminary evidence to support them, which makes it hard to judge how plausible the hypotheses being tested are and to pick out the most promising studies, the ones most likely to lead to a major breakthrough. (Many of the anecdotes used to support funding “riskier” research suffer from selective memory and major hindsight bias.) That is, however, a discussion that I’ve had before, and an updated version could be a topic for another day.

What led me to want to discuss how NIH grants are funded is another shortcoming in the NIH review process, prompted by a story published on Friday in Nature about a proposal to change the way that NIH grants are scored in order to remove what is known as “reputational bias”, one form of bias in grant scoring that inarguably does still exist. In brief, this form of bias was documented in a January 2022 NIH analysis, which found that the 10% of institutions that receive the most money from the agency get about 65% of its overall funding for research projects and the bottom half receives less than 5%, proportions that have remained persistent for years. Of course, researchers not at the “top” schools have known this for decades, referring to it as how the “rich get richer”.

First, however, let’s look at the NIH grant process.

How an NIH grant is reviewed

Contrary to the conspiracist vision of Anthony Fauci (or any other Institute director at NIH, or even the NIH director him- or herself) personally viewing every grant application and deciding who gets those sweet, sweet NIH dollars and who does not, there is a long-defined, rigorous, and codified process used by the NIH to evaluate grant applications. It begins with the submission of a grant application to the NIH. Before I discuss what happens next, I will note that the NIH has a number of granting mechanisms designed for different purposes. For example, the R21 grant is designed for preliminary work, often the “higher risk” studies that the brave mavericks demand, and doesn’t require a lot of preliminary data. (The claim that it can require no preliminary data, however, is generally nonsense. You need at least some data.) R21s can fund up to two years and cannot be renewed.

In contrast, the granddaddy of them all, the “gold standard” grant for an individual investigator, small groups of co-investigators, or collaborators, is the R01, which can be funded for up to five years (it’s also one of the only grant mechanisms where the investigator can propose basically anything, rather than having to address a particular topic or question). At the end of that time, the investigator can apply for a competitive renewal, which can extend the grant for up to another five years, and so on ad infinitum. There are a number of other grant mechanisms, which include training grants for graduate students, center grants (e.g., for cancer centers), larger multi-investigator grants, and more targeted grants—Wikipedia has a nice list here—but in general all of them are scored by groups of scientists with the relevant expertise in a review group called a study section, of which there are dozens in the NIH arranged by topic into Review Branches at the Center for Scientific Review. Many of these study sections are permanent, but the NIH can and does set up temporary study sections for topics of special interest at the time.

NIH grants generally undergo two levels of review, first the study section and then advisory councils. To guide the reviews, the NIH has five criteria that it uses to evaluate grant applications:

  • Significance
  • Investigator(s)
  • Innovation
  • Approach
  • Environment

Other considerations include “Additional Review Criteria”:

As applicable for the project proposed, reviewers will evaluate the following additional items while determining scientific and technical merit and in providing an overall impact score, but will not give separate scores for these items.

  • Study Timeline (specific to applications involving clinical trials)
  • Protections for Human Subjects
  • Inclusion of Women, Minorities, and Children
  • Vertebrate Animals
  • Biohazards
  • Resubmission
  • Renewal
  • Revision

Additional Review Considerations. As applicable for the project proposed, reviewers will consider each of the following items, but will not give scores for these items and should not consider them in providing an overall impact score.

  • Applications from Foreign Organizations
  • Select Agent
  • Resource Sharing Plans
  • Authentication of Key Biological and/or Chemical Resources
  • Budget and Period of Support

Grants undergo anonymous peer review, and usually each application is reviewed by three or four reviewers, with one of them being a statistician where appropriate. Those assigned to do the detailed reviews score each of the above areas from 1-9, although in this case low scores are better, denoting high impact/priority, and then assign an overall impact score to the grant application. During the study section meeting, the study section member assigned as primary reviewer starts the discussion with a summary of the grant application, the score assigned to it, and why that score was assigned. Then the others who evaluated each grant application do the same, after which the whole study section discusses the application. At the end, every member assigns an overall impact score to the grant under discussion before moving on to the next application. After the study section meeting, all the impact scores are used to calculate a final overall Priority Score assigned to the grant application. Also, the membership rosters of the study sections are public knowledge, as they are published on the CSR website; so it’s not uncommon for investigators who got a bad review among their reviews to make a good guess about who was responsible.
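To make the arithmetic concrete, here is a minimal sketch of how the individual votes might be combined. It assumes, as I understand the NIH's published scoring guidance, that the final score is the mean of every voting member's 1-9 impact score, multiplied by 10 and rounded to a whole number, giving a range of 10 (best) to 90 (worst). The function name and the handling of exact halves are my own illustrative choices, not anything the NIH specifies.

```python
def final_impact_score(member_scores):
    """Combine each member's 1-9 impact score into one final score (10-90).

    Assumption: final score = mean of all member scores x 10, rounded to a
    whole number. Exact halves are rounded up here (an illustrative choice).
    Remember that lower is better.
    """
    if not member_scores:
        raise ValueError("at least one member score is required")
    for score in member_scores:
        if not isinstance(score, int) or not 1 <= score <= 9:
            raise ValueError(f"impact scores must be integers from 1 to 9, got {score!r}")
    total = sum(member_scores)
    n = len(member_scores)
    # Integer form of floor(10 * total / n + 0.5), which avoids float rounding.
    return (20 * total + n) // (2 * n)


# Example: a panel that mostly likes an application, plus one harsh vote.
print(final_impact_score([2, 2, 3, 2, 7]))
```

Note how a single outlying vote moves the final number, which is one reason an outspoken negative reviewer can matter so much.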

Although I’ve never served as a permanent member of an NIH study section, I have served as an ad hoc member for specific expertise. (Ad hoc members generally serve for only one or a handful of grant review cycles, rather than being assigned for multi-year stints.) As such, I can only comment on the dynamics of study sections in which I’ve participated. One thing that I’ve noticed is that it only takes one highly negative review from a reviewer who is outspoken to tank an application. I’ve also noted that someone who really likes a grant application can sway others to give it a lower (that is, better) score, but the effect seems a bit weaker than that of “negative campaigning.”

Obviously, environment is the criterion most subject to reputational bias, because it requires the reviewer to determine if the university or institution at which the research is to be performed has the facilities and expertise to give the proposed project the highest chance of success. Naturally, being at a university like Harvard, Yale, or Stanford (for example) will provide an investigator an inherent advantage, because such universities have more resources and expertise than mid- or lower-tier universities.

When an individual grant application hits the NIH, it will be assigned to a study section. Investigators can influence this process by suggesting study sections, and, unsurprisingly, some study sections have reputations for being harsher than others. Once a grant is assigned to a study section, that study section’s Scientific Review Officer (SRO) will read it, decide if it’s appropriate for the study section, and assign reviewers:

Assignment of Applications to Specific Reviewers: The SRO assigns applications to particular reviewers by matching the science in the application to the reviewer’s expertise. Assignment considerations include: reviewer knowledge about, and interest in, the goals of the project; expertise in the techniques proposed; reviewer workload; and real or perceived conflicts of interest. The SRO encourages reviewers to let him/her know of any concerns that they have about their assignments. This would include conflicts of interest, concerns about the appropriateness of the assignment, or the need for additional expertise.

The SRO also recruits scientists to serve on the study section, thusly:

Identifying and Recruiting Reviewers: Possibly the most important role of the SRO is to ensure that the reviewers present at the study section meeting have all the needed expertise to evaluate the applications under review.

In choosing regular members for study sections, it is essential that the SRO recognizes current trends in the field and ensures that the membership reflects where the field is now and where it is going. It is also critical that the expertise of each nominee complements that of the other members and strengthens the study section as a whole.

As you can see, the SRO is a big deal.

The SRO also runs the study section meeting, collates the reviews, takes notes, and from those notes and the overall reviews produces a Summary Statement that includes the overall Priority Score assigned (with a percentile denoting the percentage of grant applications that received better scores than the applicant’s), comments about the discussion at the study section, and the original “raw” reviews from each study section member who reviewed the grant. Note that now generally only grants that score in the top one-third to one-half receive overall Priority Scores and Summary Statements, because any grant with a higher (and therefore worse) score is so unlikely to be funded as to make it not worth the SRO’s effort to put the documents together. These investigators do, however, still receive the reviews carried out by individual study section members. SROs also handle appeals from applicants who question whether their review was fair or whether one or more of the reviewers had the requisite expertise.
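For readers who want to see what a percentile means in this context, the following is a deliberately simplified sketch: it just reports the percentage of applications in a comparison pool that received a better (numerically lower) score. The NIH's actual percentile calculation uses its own ranking formula and comparison pool, which I am not reproducing here; the function name and the example scores are hypothetical.

```python
def percent_scoring_better(own_score, pool_scores):
    """Percentage of applications in the pool with a better (lower) score.

    Simplified illustration only; not the NIH's official percentile formula.
    """
    if not pool_scores:
        raise ValueError("the comparison pool must not be empty")
    better = sum(1 for score in pool_scores if score < own_score)
    return 100.0 * better / len(pool_scores)


# Hypothetical pool of final scores from one study section.
pool = [12, 18, 20, 25, 31, 33, 40, 47, 52, 60]
print(percent_scoring_better(20, pool))  # two of ten applications scored better
```

With roughly the top 10% of applications being funded, an application needs very few competitors ahead of it to stay in the running.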

The second level of review occurs through committees formed at each Institute and Center (IC) at the NIH called Advisory Councils:

The Advisory Council/Board of the potential awarding Institute/Center performs the second level of review (See Advisory Councils or Boards). Advisory Councils/Boards are composed of scientists from the extramural research community and public representatives (NIH Federal Advisory Committee Information). Members are chosen by the respective IC and are approved by the Department of Health and Human Services. For certain committees, members are appointed by the President of the United States.

Now here’s the part where the conspiracy might come in:

Recommendation Process

  • NIH program staff members examine applications and consider the overall impact scores given during the peer review process, percentile rankings (if applicable) and the summary statements in light of the Institute/Center’s priorities.
  • Program staff provide a grant-funding plan to the Advisory Board/Council. Council members have access to applications and summary statements pending funding for that IC in that council round.
  • Council members conduct a Special Council Review of grant applications from investigators who currently receive $1 million or more in direct costs of NIH funding to support Research Project Grants (see NOT-OD-12-140). This additional review is to determine if additional funds should be provided to already well-supported investigators and does not represent a cap on NIH funding.
  • The Advisory Council/Board also considers the Institute/Center’s goals and needs and advises the Institute/Center director concerning funding decisions.
  • The Institute/Center director makes final funding decisions based on staff and Advisory Council/Board advice

It’s that last part that the conspiracy theorists harp on. In theory, an Institute director like Anthony Fauci could override all the peer review to fund a grant, but in practice it almost never happens. Why? Because the NIH set up this process in order to minimize the possibility of direct involvement of its leaders in picking and choosing grant awardees based on personal whim. The whole system exists to try to ensure as much as possible that grant selection is based on scientific merit, and few other considerations.

I won’t go through the whole appeals process, other than to say that appeals rarely succeed. I also won’t say that politics and the personal preferences of various Institute/Center (IC) directors never play a role in grant funding decisions, but I will say that the system is set up to minimize that role. Also, often the people who most characterize NIH funding decisions as the personal doling out of funding by directors are the same ones who have no compunction about trying to shut down research they don’t like themselves, and they have been doing so for a long time.

Minimizing reputational bias

With that background in hand, let’s take a look at what the NIH is proposing to do now to decrease reputational bias in its funding decisions. According to Nature:

The US National Institutes of Health (NIH) has released a tentative plan to change how its research grant applications are scored, with the aim of reducing bias and lowering the burden on reviewers. Under the new system, reviewers would no longer rate researchers’ expertise or their institutions’ access to resources, and there would be fewer scoring criteria overall.

The NIH’s Center for Scientific Review (CSR), which organizes the peer-review groups that evaluate more than 90% of the research grants awarded by the agency, announced these proposed changes at a meeting on 8 December attended by Lawrence Tabak, acting NIH director, and a panel of his advisers. The revamp has not yet been finalized, and any changes would not be implemented until 2024 at earliest.

So here is the main proposed change:

Reviewers currently score NIH research proposals according to five criteria: significance, investigator(s), innovation, approach and environment (where the research will be carried out). These criteria are defined by US legislation, so the NIH cannot modify them without approval from lawmakers, but it can change the way they’re interpreted or scored. The new system doesn’t throw out the old criteria, but groups them into three categories: the importance of the research, its feasibility and rigour, and the expertise and resources of the researcher and their institution.

Byrnes says that the last category, which won’t be scored under the proposal, is frequently misinterpreted. Reviewers sometimes score applicants and their institutions without considering them in the context of the proposed research — the original intention of the category. This has led to higher scores for prestigious institutions and individuals. Under the proposal, rather than score this category, reviewers would choose whether they think researcher expertise or institutional resources are adequate or not. If they select the latter, they can leave specific feedback about deficiencies in a text box on the review form. This will “prevent reviewers from waxing poetic about a really famous investigator that tilts the evaluation of the science”, Byrnes says.

Again, the “rich get richer” applies not just to high-reputation universities and institutions, but to individual researchers. Indeed, in 2017 the NIH proposed a plan to limit how many NIH grants any one individual researcher could hold at any one time, the idea being to “spread the wealth around” more after a report had shown that 10% of grant recipients received 40% of NIH grant funding. While it is expected that top researchers would be more effective at competing for research grants and would thus have more funding than recipients who were not as top tier, the NIH thought the disparity was too great and also wanted to direct more funding to young scientists in order to nurture the next generation. These concerns led to a proposal that applications from scientists who already controlled more than $1.5 million in NIH grants undergo an additional layer of review, a situation that only applied to 5% of grant awardees. The pushback was so fierce that the NIH ultimately backed down and scrapped the proposal. Instead, the NIH announced the creation of a special fund drawn from its existing budget to be targeted at early- and mid-career scientists in an attempt to lower the average age of the researchers it supports.

In any event, here’s what the NIH proposes, specifically, with respect to its proposed simplified review criteria:

NIH proposes to reorganize the five review criteria into three factors, with Factors 1 and 2 receiving a numerical score. Reviewers will be instructed to consider all three factors (Factors 1, 2 and 3) in arriving at their Overall Impact Score (scored 1-9), reflecting the overall scientific and technical merit of the application.

  • Factor 1: Importance of the Research (Significance, Innovation), numerical score (1-9)
  • Factor 2: Rigor and Feasibility (Approach), numerical score (1-9)
  • Factor 3: Expertise and Resources (Investigator, Environment), assessed and considered in the Overall Impact Score, but not individually scored

Within Factor 3 (Expertise and Resources), Investigator and Environment will be assessed in the context of the research proposed. Investigator(s) will be rated as “fully capable” or “additional expertise/capability needed”. Environment will be rated as “appropriate” or “additional resources needed.” If a need for additional expertise or resources is identified, written justification must be provided. Detailed descriptions of the three factors can be found here.
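As a rough illustration of how the proposed framework changes what a reviewer actually fills in, here is a minimal sketch of a review record with the rules quoted above: Factors 1 and 2 and the Overall Impact Score are scored 1-9, Factor 3 gets a rating rather than a number, and a written justification is required whenever additional expertise or resources are flagged. The class and field names are hypothetical, chosen by me for illustration; this is not an NIH form or API.

```python
from dataclasses import dataclass

INVESTIGATOR_RATINGS = {"fully capable", "additional expertise/capability needed"}
ENVIRONMENT_RATINGS = {"appropriate", "additional resources needed"}


@dataclass
class ProposedReview:
    importance: int          # Factor 1 (Significance, Innovation), scored 1-9
    rigor_feasibility: int   # Factor 2 (Approach), scored 1-9
    investigator: str        # Factor 3, rated but not numerically scored
    environment: str         # Factor 3, rated but not numerically scored
    overall_impact: int      # Overall Impact Score, 1-9
    justification: str = ""  # required only when Factor 3 flags a need

    def problems(self):
        """Return a list of rule violations; an empty list means the review is valid."""
        issues = []
        for name in ("importance", "rigor_feasibility", "overall_impact"):
            if not 1 <= getattr(self, name) <= 9:
                issues.append(f"{name} must be between 1 and 9")
        if self.investigator not in INVESTIGATOR_RATINGS:
            issues.append("unrecognized investigator rating")
        if self.environment not in ENVIRONMENT_RATINGS:
            issues.append("unrecognized environment rating")
        flagged = (self.investigator != "fully capable"
                   or self.environment != "appropriate")
        if flagged and not self.justification.strip():
            issues.append("written justification is required for Factor 3")
        return issues


review = ProposedReview(2, 3, "fully capable", "additional resources needed", 3)
print(review.problems())  # flags the missing written justification
```

The point of the design is visible in the record itself: there is simply no numeric field in which a reviewer could reward a famous name or a prestigious campus.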

Unsurprisingly, some scientists are not thrilled with the changes:

Some advisers attending the 8 December meeting pushed back on the plan, suggesting that researchers and institutional resources are crucial factors in determining the merit of research projects. “I do think there’s some value in some objective score to assess the investigator,” said Shelley Berger, an epigeneticist at the University of Pennsylvania in Philadelphia. Without a score, Berger added, it could be difficult to understand the reviewer’s thinking and how they factored researcher expertise into their decision. Speaking at the meeting, Byrnes countered that reviewers would still have the option to leave comments about their concerns, which could be reflected in the overall impact score.

Personally, having reviewed grants, I see this as more or less a non-issue. If specific reviewers are savvy enough to know the capabilities of a given institution and whether they align with what is required to successfully complete the proposed project, then I see no reason under the new framework that they can’t state that and factor it into their overall impact score. More importantly, as I’ve discovered, mid-tier universities often have specific strengths, just not as many subject areas of strength as the “top tier” universities, but they are not infrequently penalized just based on their overall reputation compared to, say, Harvard or Stanford.

Of course, there are those who think this proposal doesn’t go far enough:

Although some critics appreciate that the agency is trying to eliminate reputational bias, they say the proposed changes do not address the root of funding disparities at the NIH. Omolola Eniola-Adefeso, a biomedical engineer at the University of Michigan, Ann Arbor, who co-authored a 2021 article calling on the NIH to fund more Black scientists, tells Nature that, to have a tangible impact, more diversity is needed among reviewers. She cites a landmark analysis published more than a decade ago, which found that Black researchers were significantly less likely to receive NIH research funding compared with white researchers, even when factors such as scientific credentials and employer were taken into account. This means efforts to lessen the impact of reputational bias might not translate to more equitable funding, she says.

In addition to considering the race and gender of reviewers, ensuring that panels have representation from under-represented institutions could have a big impact, says Enrique Neblett, a public-health researcher at the University of Michigan. Given the persistence of funding disparities, the NIH needs to take bolder action, he says.

I know what some “purists” are going to say here, namely that scientific merit should be the be-all and end-all of determining who gets NIH grants. (Actually, they’ll say worse, but I’m being kind.) While I’m sympathetic to such an argument in the abstract, I also realize that the NIH is nonetheless an agency of the US government. I also realize that there is no such thing as true “colorblindness” when it comes to functions like evaluating grant applications, as can be seen from findings that disparities in funding success persist even when credentials and institution are taken into account. It’s a delicate balancing act between maintaining the scientific rigor of the process and promoting policies that can benefit society by supporting younger researchers and diversifying the scientific workforce, both of which can contribute to better science funded by the NIH in the future that addresses the problems in medicine and public health most relevant to our population.

Reality versus conspiratorial fantasy

One thing that this whole debate over how best to allocate limited NIH funding to scientists submitting grant proposals should dispel is the notion that the system in any way resembles the fantasy version portrayed by antivax conspiracy theorists, quacks, and grifters. Just because they view everything through a strictly transactional lens and can’t imagine any government institution not operating basically the same way does not mean that that’s how the NIH works. The NIH grant process has a lot of shortcomings, but it really does strive to be as much of a meritocracy as possible while at least not undermining societal goods, such as a more diverse workforce and supporting young scientists so that they don’t abandon biomedical research because they can’t obtain funding to support their laboratories and research programs. The proposed changes in NIH grant review criteria seek to strike this balance while maintaining scientific rigor.

What the proposed changes should help to emphasize is that the NIH, for all the deficiencies both real and exaggerated in its study section system, is constantly seeking ways to change the application process so that the most scientifically impactful and meritorious grants receive funding, even if a lot of meritorious grants do not, simply because there is not enough funding. Being a system set up by humans, the NIH will never achieve a perfect system for funding grants. However, contrary to the conspiracy theory, the NIH is set up to try to minimize the effects of IC directors (or anyone else) picking winners and losers based on a quid pro quo or personal whim, and that’s one of the things that makes the NIH great.

Author

Posted by David Gorski

David H. Gorski, MD, PhD, FACS is a surgical oncologist at the Barbara Ann Karmanos Cancer Institute specializing in breast cancer surgery, where he also serves as the American College of Surgeons Committee on Cancer Liaison Physician as well as an Associate Professor of Surgery and member of the faculty of the Graduate Program in Cancer Biology at Wayne State University.