Pulse oximeters are a fairly simple, yet nonetheless amazing piece of medical technology that physicians often take for granted. A non-invasive method for monitoring the oxygen content of a patient’s blood with a clip placed on a finger or earlobe, pulse oximeters took on an outsized role when the COVID-19 pandemic hit. The reason is that a key measure of severity of the pneumonia caused by COVID-19 is hypoxia. Normal oxygen saturation of the hemoglobin in the red blood cells is 95% or higher (meaning that 95% of the hemoglobin is carrying oxygen), with saturations of 90% and lower being an indicator of severe disease. Indeed, one feature of COVID-19 that doctors rapidly discovered was a phenomenon called silent hypoxia (sometimes “happy hypoxia“), in which patients are hypoxic (sometimes very hypoxic) with none of the usual symptoms of hypoxia or just minimal symptoms. Stories of patients with dangerously low oxygen saturations scrolling on their phones, chatting with doctors, and generally describing themselves as comfortable rapidly became a staple among doctors treating the then mysterious new coronavirus disease in 2020.

Pulse oximeters, which detected the silent hypoxia, are a critical tool for monitoring patients during surgery or with any disease that can impair the function of their lungs and heart. They are used to monitor all patients undergoing general anesthesia and heavy sedation, as well as patients in ICUs, emergency rooms, and a number of other healthcare settings. But what if pulse oximeters, usually presumed to be accurate, were not so accurate for a large group of people? Last week, as I was driving to work, I listened to an NPR report about studies showing that pulse oximeters tend to overestimate oxygen saturation (SpO2) in patients with darker skin, including Black, Hispanic, and Asian patients, as compared to white patients:

Over the past two years, the pulse oximeter has become a crucial tool for tracking the health of COVID-19 patients.

The small device clips onto a finger and measures the amount of oxygen in a patient’s blood. But a growing body of evidence shows the device can be inaccurate when measuring oxygen levels in people with dark skin tones.

A study published on Monday only adds to this concern.

Researchers analyzing pre-pandemic health data also find those measurements resulted in patients of color receiving less supplemental oxygen than white patients did.

“We were fooled by the pulse oximeter,” says the study’s lead author Dr. Leo Anthony Celi, who’s clinical research director and principal research scientist at the MIT Laboratory of Computational Physiology.

“We were given the false impression that the patients were okay. And what we showed in this study is that we were giving them less oxygen than they needed,” he says.

As an example of potential disparities in COVID-19 care related to this phenomenon, the story included an anecdote about a physician, Dr. Sandra Looby-Gordon, whose son was sick with COVID-19 but for whom pulse oximeter readings gave a false impression that he was not in imminent danger:

Looby-Gordon, who’s a physician at Boston Medical Center, found herself on the phone with a triage nurse at a Florida hospital, arguing that her own son — who was very sick with COVID-19 — needed to be admitted to the hospital.

“‘Well, yeah, he is looking pretty short of breath,'” Looby-Gordon remembers the nurse responding, “‘but his oxygen levels are good.'”

The nurse was basing this on the reading from the pulse oximeter clipped to his finger, but this assessment did not feel right to Looby-Gordon.

She got off the phone with the nurse and spoke with other doctors at her medical center. One of them reminded her of a 2020 article in the New England Journal of Medicine showing the pulse oximeter tends to be inaccurate in people with dark skin tones.

“On top of that, my son is — this sounds strange — but very dark, very dark complexion,” says Looby-Gordon.

Sure enough, later when her son was given a more invasive test for measuring blood oxygen, it showed his oxygen levels were actually dangerously low.

That “more invasive test” was likely an arterial blood draw to measure the oxygen level in the patient’s arterial blood directly, rather than indirectly. Surprisingly, many physicians are not aware of this problem with pulse oximeters. Indeed, the NPR story notes that Dr. Looby-Gordon hadn’t been “fully aware of how the device could be so misleading” before her son got COVID-19 and she had to argue with treating doctors that he was sicker than his home pulse oximeter readings suggested.

This phenomenon was not unknown before COVID-19. However, COVID-19 did bring it into the news and lead to a greater understanding of how racial disparities in medical care can show up in unexpected places and how science-based medicine needs to pay more attention to them.

How pulse oximeters work

The most accurate way to determine oxygen saturation is to sample arterial blood and measure an arterial blood gas (ABG), in which case the measurement is referred to as SaO2, as opposed to saturation estimates obtained by pulse oximetry (SpO2). The ABG provides not just SaO2 measurements, but partial pressures of oxygen and carbon dioxide dissolved in the blood, as well as blood pH, bicarbonate, and hemoglobin levels. These measurements allow doctors to determine, among other things, if there’s enough oxygen in the blood and whether there’s an acidosis (too much acid) or alkalosis (too little acid) and whether the acidosis or alkalosis is respiratory or metabolic in origin. However, monitoring ABGs requires either an intra-arterial catheter from which blood can be drawn or periodic arterial sticks to draw arterial blood for measurements. A pulse oximeter, although it provides only an estimate of oxygen saturation and the pulse rate, is valuable because it’s non-invasive and can take continuous readings.

Pulse oximeters are cool pieces of engineering that exploit the different light absorbances of hemoglobin when it doesn’t have oxygen bound to it compared to when it does. In brief, pulse oximeters use a pair of LEDs, one of which emits red light (660 nm wavelength) and the other infrared (940 nm), and rely on the difference in absorption of that light between deoxygenated hemoglobin (Hb) and oxygenated hemoglobin (HbO2) in the red and near-infrared part of the spectrum:

The absorption spectrum of oxygenated and deoxygenated hemoglobin relevant to pulse oximetry.

In brief, HbO2 absorbs more infrared light and allows more red light to pass through, while Hb allows more infrared light to pass through and absorbs more red light. I’ll borrow a nice summary of how the device works from here:

  • The LEDs sequence through their cycle of one on, then the other, then both off about thirty times per second.
  • The amount of light that is transmitted (in other words, that is not absorbed) is measured.
  • These signals fluctuate in time because the amount of arterial blood that is present increases (literally pulses) with each heartbeat.
  • By subtracting the minimum transmitted light from the peak transmitted light in each wavelength, the effects of other tissues is corrected for allowing for measurement of only the arterial blood.
  • The ratio of the red light measurement to the infrared light measurement is then calculated by the processor (which represents the ratio of oxygenated hemoglobin to deoxygenated hemoglobin).
  • This ratio is then converted to SpO2 by the processor via a lookup table based on the Beer–Lambert law.
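
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any manufacturer's algorithm: the function names are made up for this post, the light signals are synthetic, each pulsing component is also divided by its average level (the usual "ratio of ratios" normalization, a detail the summary above leaves out), and the linear calibration SpO2 ≈ 110 − 25R is a commonly quoted textbook approximation standing in for the empirically derived lookup table a real device would use.

```python
import math

def pulsatile_fraction(samples):
    """Peak-to-peak (pulsing) part of the signal divided by its mean level."""
    ac = max(samples) - min(samples)   # changes with each heartbeat (arterial blood)
    dc = sum(samples) / len(samples)   # steady background (tissue, venous blood, skin)
    return ac / dc

def ratio_of_ratios(red, infrared):
    """R = (AC_red / DC_red) / (AC_ir / DC_ir)."""
    return pulsatile_fraction(red) / pulsatile_fraction(infrared)

def estimate_spo2(r):
    """Hypothetical linear calibration, clamped to 0-100%.

    Real devices use an empirically derived lookup table instead.
    """
    return max(0.0, min(100.0, 110.0 - 25.0 * r))

def synthetic_signal(dc, ac, n=100, beats=2):
    """Made-up transmitted-light trace: a constant level plus a sinusoidal pulse."""
    return [dc + (ac / 2) * math.sin(2 * math.pi * beats * i / n) for i in range(n)]

# Invented example: the red signal pulses half as strongly as the infrared one.
red = synthetic_signal(dc=1.0, ac=0.010)
infrared = synthetic_signal(dc=1.0, ac=0.020)

r = ratio_of_ratios(red, infrared)
spo2 = estimate_spo2(r)
print(f"R = {r:.2f}, estimated SpO2 = {spo2:.1f}%")
```

With the red signal pulsing half as strongly as the infrared one, R comes out at 0.5 and this toy calibration returns about 97.5%. Note that the whole calculation assumes hemoglobin is the only absorber whose contribution changes the ratio, and that the calibration was derived from a representative group of people.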

As a result of this, the pulse oximeter produces waveforms that look like this:

Pulse oximetry waveforms.

Interestingly, it’s long been known that nail polish can interfere with pulse oximetry readings, rendering them less accurate. (What a surprise! A pigment interferes with readings based on measuring how much light is absorbed at different wavelengths!) Other potential failure modes of pulse oximetry have also long been appreciated, namely carboxyhemoglobin (due to carbon monoxide poisoning, with carbon monoxide binding to hemoglobin) and methemoglobin (hemoglobin with an oxidized iron atom, resulting in increased O2 binding and reduced unloading). In these cases, carboxyhemoglobin and methemoglobin have absorption spectra similar to that of HbO2, leading pulse oximeters to report a falsely elevated SpO2. Under these conditions, pulse oximetry does not accurately estimate oxygen saturation, and physicians learn this during training.

Pulse oximetry in non-white patients

The inaccuracy of pulse oximetry in people with more pigmented skin is an issue that was long (sort of) known, but relatively little attention was paid to this issue before the pandemic hit, leading to an unprecedented influx of patients with hypoxia who needed monitoring, as well as the purchase of more home pulse oximeters than had ever been purchased before. In addition, smart watches and fitness monitors such as the Apple Watch and Fitbit Sense added a pulse oximeter to their health monitoring repertoire and were soon found to suffer from issues that negatively impacted their reliability. As this Washington Post story notes:

The tiny type at the bottom of Apple’s website says its blood oxygen app is “not intended for medical use” and is “only designed for general fitness and wellness purposes.” Fitbit’s small print says its blood-oxygen app is “not intended to diagnose or treat any medical condition” and is useful to “help you manage your well-being and keep track of your information.”

With the reporter noting:

Over several days of comparing my second Apple Watch’s measurements to my FDA-approved finger oximeter, Apple’s readings most often differ by two or three percentage points — though they’ve also sometimes exactly matched, and sometimes have been as much as seven percentage points lower.

And, oddly enough:

When I tested the Apple Watch on a colleague whose skin is darker than mine, the results were also off from the finger pulse oximeter, but less wildly so.

Others have found that the Apple Watch, at least, produces fairly accurate SpO2 readings.

As much as I love my Apple Watch otherwise, I ultimately turned off the pulse oximeter feature because its regular SpO2 measurements were chewing up the battery, greatly decreasing battery life, and because it often reported low oxygen saturation readings that were clearly not accurate.

The first study that looked at pulse oximetry in non-white patients during COVID-19 was published as a letter in 2020 in The New England Journal of Medicine. The study came from my medical alma mater, the University of Michigan, and examined 10,789 pairs of measures of oxygen saturation by pulse oximetry and arterial oxygen saturation in arterial blood gas obtained from 1,333 White patients and 276 Black patients in a University of Michigan cohort and 37,308 pairs obtained from 7,342 White patients and 1,050 Black patients in a multicenter cohort.

The study found:

In unadjusted analyses, the area under the receiver-operating-characteristic curve for detecting an arterial blood gas oxygen saturation of less than 88% according to the oxygen saturation on pulse oximetry was 0.84 (95% CI, 0.81 to 0.87) among Black patients and 0.89 (95% CI, 0.87 to 0.91) among White patients (P=0.003). In the multicenter cohort, the unadjusted analyses involving patients with an oxygen saturation of 92 to 96% on pulse oximetry showed an arterial blood gas oxygen saturation of less than 88% in 160 of 939 measurements in Black patients (17.0%; 95% CI, 12.2 to 23.3) and in 546 of 8795 measurements in White patients (6.2%; 95% CI, 5.4 to 7.1).

Thus, in two large cohorts, Black patients had nearly three times the frequency of occult hypoxemia that was not detected by pulse oximetry as White patients. Given the widespread use of pulse oximetry for medical decision making, these findings have some major implications, especially during the current coronavirus disease 2019 (Covid-19) pandemic. Our results suggest that reliance on pulse oximetry to triage patients and adjust supplemental oxygen levels may place Black patients at increased risk for hypoxemia. It is important to note that not all Black patients who had a pulse oximetry value of 92 to 96% had occult hypoxemia. However, the variation in risk according to race necessitates the integration of pulse oximetry with other clinical and patient-reported data.

That’s roughly one in six SpO2 readings of 92-96% in Black patients in the multicenter cohort actually concealing occult hypoxemia, compared to about one in 16 such readings in white patients.
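
Those fractions can be checked directly from the counts in the quoted passage. This is just arithmetic on the published numbers for the multicenter cohort; the variable names are mine.

```python
# Counts quoted above: SaO2 below 88% despite an SpO2 of 92-96% (multicenter cohort)
black_occult, black_total = 160, 939
white_occult, white_total = 546, 8795

black_rate = black_occult / black_total
white_rate = white_occult / white_total
relative = black_rate / white_rate

print(f"Black patients: {black_rate:.1%} (about 1 in {1 / black_rate:.0f})")
print(f"White patients: {white_rate:.1%} (about 1 in {1 / white_rate:.0f})")
print(f"Relative frequency: {relative:.1f}x")
```

The relative frequency works out to about 2.7, which is where the authors' "nearly three times" comes from.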

The study discussed in the NPR report last week was published in JAMA Internal Medicine and examined a cohort of 3,069 patients in the intensive care unit. Specifically, it was a retrospective cohort study based on the Medical Information Mart for Intensive Care (MIMIC)-IV critical care data set. This dataset includes data from 40,000 patients admitted to intensive care units at the Beth Israel Deaconess Medical Center, with patient identifiers removed according to the Health Insurance Portability and Accountability Act (HIPAA), so that the dataset could be used by a wide variety of researchers. Of note, this database is only up-to-date through 2019, which means that it did not at the time of analysis include COVID-19 patients.

The study included patients who were documented with a race and ethnicity as Asian, Black, Hispanic, or White and were admitted to the intensive care unit (ICU) for at least 12 hours before needing advanced respiratory support (if it was needed). Oxygenation levels and nasal cannula flow rates for up to 5 days from ICU admission or until the time of intubation, noninvasive positive pressure ventilation, high-flow nasal cannula, or tracheostomy were analyzed. In this case, rather than comparing oxygen saturation measured by ABG with that estimated by pulse oximeter, the primary outcome measured was time-weighted average supplemental oxygen rate, with covariates including “race and ethnicity, sex, SpO2–hemoglobin oxygen saturation discrepancy, data duration, number and timing of blood gas tests on ICU days 1 to 3, partial pressure of carbon dioxide, hemoglobin level, average respiratory rate, Elixhauser comorbidity scores, and need for vasopressors or inotropes.” (Vasopressors constrict blood vessels to raise blood pressure; inotropes make the heart pump harder and faster.)
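
Because oxygen flow rates get adjusted at irregular intervals, a plain average of the charted values would overweight periods with frequent adjustments; a time-weighted average instead weights each rate by how long it was in effect. Here is a minimal sketch, assuming each charted rate stays in effect until the next entry (the paper's exact method may differ). The function name and the sample numbers are invented for illustration.

```python
def time_weighted_average(entries, end_time):
    """entries: list of (time_in_hours, flow_rate_L_per_min), sorted by time.

    Assumes each rate holds until the next entry (or end_time for the last one).
    """
    total = 0.0
    following = entries[1:] + [(end_time, None)]
    for (t, rate), (t_next, _) in zip(entries, following):
        total += rate * (t_next - t)   # rate multiplied by hours in effect
    return total / (end_time - entries[0][0])

# Hypothetical chart: 2 L/min for 6 h, 4 L/min for 2 h, then 1 L/min for 16 h
chart = [(0, 2.0), (6, 4.0), (8, 1.0)]
twa = time_weighted_average(chart, end_time=24)
simple_mean = sum(rate for _, rate in chart) / len(chart)
print(f"time-weighted: {twa:.2f} L/min, unweighted mean: {simple_mean:.2f} L/min")
```

In this made-up chart the brief stretch at 4 L/min pulls the unweighted mean up to about 2.33 L/min, while the time-weighted average is 1.5 L/min.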

What this study adds to the literature is a demonstration that the overestimation of SpO2 in darker-skinned patients results in less supplemental oxygen being given. Less oxygen for a given level of unsuspected hypoxemia likely results in poorer outcomes as well, as suggested by a 2021 study that examined five databases with data from nearly 88,000 patients in total. That study found that discrepancies in pulse oximetry accuracy among racial and ethnic subgroups were associated with higher rates of hidden hypoxemia, mortality, and organ dysfunction. Patients with and without hidden hypoxemia were demographically and clinically similar at baseline ABG measurement, but those with hidden hypoxemia subsequently experienced higher organ dysfunction scores and higher in-hospital mortality.

This racial disparity could well go beyond just patients who are critically ill. A Veterans Administration study published two weeks ago examined patients in general (non-ICU) care in VA hospitals from 2013-2019, concluding:

In general care inpatient settings across the Veterans Health Administration where paired readings of arterial blood gas (SaO2) and pulse oximetry (SpO2) were obtained, black patients had higher odds than white patients of having occult hypoxemia noted on arterial blood gas but not detected by pulse oximetry. This difference could limit access to supplemental oxygen and other more intensive support and treatments for black patients.

With the authors concluding:

Errors in pulse oximeters could be due to a combination of systematic error or bias, which is reproducible across measurements, as well as random error or noise. Because pulse oximeter error is due to a combination of both processes, the magnitude of pulse oximeter error might not be the same each time a reading is taken. We empirically show that these errors could result in clinically meaningful differences in the interpretation of pulse oximetry across racial groups. In patients with two pairs of SpO2-SaO2 readings measured on the same day, a well aligned SpO2-SaO2 pair for white patients was associated with low levels of occult hypoxemia on subsequent pairs; such concordance might be reassuring in many clinical scenarios. This concordance was less true for black patients, and these differences should be considered in deciding whether to obtain an arterial blood gas reading in appropriate clinical situations until non-racially biased pulse oximeters are in use.

Although this study found a smaller difference in the rate of hidden hypoxemia between white patients and Black patients (15.6% compared to 19.6%), the difference was still significant. The smaller gap could be due to the patient population under study being less ill than in the other studies, which examined critically ill patients. Still, although results on the accuracy of pulse oximeters have varied, the trend is becoming clearer based on recent data. I could go on, but suffice it to say that a growing body of evidence is showing that pulse oximeters can be dangerously inaccurate in some patients with darker skin, as noted in this STAT News story:

It’s been known for decades that the devices are less accurate in patients with darker skin and those wearing nail polish, but new interest and a stream of research about potential racial bias in the devices has been sparked by the racial disparities seen in Covid deaths and treatment. The measurement of oxygen levels using the devices has played a critical role in determining which Covid patients are admitted to the hospital and given supplemental oxygen and other therapies. The devices, invented in the 1970s, were tested on largely white populations.

“This is telling us what we see as disparities could be due to technology that is not optimized for all populations,” said Leo Anthony Celi, a co-author of the JAMA Internal Medicine paper, and an ICU physician and a principal research scientist at the Institute for Medical Engineering and Science at MIT who helped create the large public database of ICU patients used in the study. “We’re seeing the downstream effect. It performs poorly as soon as you apply it outside the demographic it was designed for.”

In June, the FDA updated its safety communication on pulse oximeter accuracy and limitations to announce:

The FDA continues to evaluate all available information pertaining to factors that may affect pulse oximeter accuracy and performance. Because of ongoing concerns that these products may be less accurate in individuals with darker skin pigmentations, the FDA is planning to convene a public meeting of the Medical Devices Advisory Committee later this year to discuss the available evidence about the accuracy of pulse oximeters, recommendations for patients and health care providers, the amount and type of data that should be provided by manufacturers to assess pulse oximeter accuracy, and to guide other regulatory actions as needed. Further details concerning the agenda, timing, and location of the Advisory Committee meeting will be announced in the coming weeks.

Currently, FDA guidance states that clinical trials for the approval of pulse oximeter devices should include at least two darkly pigmented people, or 15% of the subject pool, whichever is larger. It is not at all clear whether this is sufficient to make sure that device makers overcome this problem.

An unhelpful response

It’s becoming harder to deny that differences in pulse oximeter accuracy based on race exist and could well be harming people of color. Unfortunately, the response on the part of device manufacturers to these findings has not always been encouraging. For instance, in response to the University of Michigan study in 2020, Joe Kiani, founder and CEO of Masimo Corp., published an editorial entitled Pulse Oximeters Are Not Racist in which, after stating that he was co-inventor of the “modern day, measure-through motion and low perfusion pulse oximeter (SET Pulse Oximeter),” he desperately tried to blame anything other than differences in skin pigment for the discrepancy in readings reported between white and Black people. He started by citing sickle cell anemia (sickle cell trait affects 10% of Black populations and can affect how pulse oximeters read), asking if sickle cell trait could account for the differences. He also asked if the pulse oximeters used at the University of Michigan were Masimo pulse oximeters, and wondered if “tissue damage and poor circulation, which afflicts Black people more than any other racial or ethnic group” had affected the pulse oximetry readings. He asked if differences in the percentage of patients in each group with carboxyhemoglobinemia could account for the difference and pointed out that various drugs can cause methemoglobinemia.

Kiani even tried to blame hydroxychloroquine, the anti-malarial drug that was repurposed for COVID-19 and later found not to work:

One of them, hydroxychloroquine, which has been recently used on COVID-19 patients, has been shown to dramatically elevate MetHb in Black patients. MetHb not only causes huge errors in pulse oximetry, including biasing pulse oximetry readings, but also can kill the patient if it’s not detected and treated immediately. Did the Michigan study account for this?

At the end, he became quite indignant:

What these publications did is regretful:

  1. With very little explanation and underlying data, the Michigan authors sent in their findings.
  2. The New England Journal of Medicine published their findings seemingly without asking for the kind of data that you’d expect in a scientific journal.
  3. The Boston Review and the New York Times rushed to give the purported bias in a pulse oximeter a racist narrative.

We need to go back to our meritocracy and not let the acts of some badly behaved people change who we are.

Here’s the thing. Even though Kiani made a few reasonable points about possible confounders, his overall tone was so incredibly defensive and full of attacks on the Michigan authors that it made it hard not to view these points as questionable. Basically, what Kiani did is what so many of my white colleagues do when racial disparities are pointed out to them and it is suggested that they might be indicative of systemic racism; he took the observation as a direct accusation of racism against him to an almost comical extent, including a photo of himself holding a “Black lives matter” sign with the article and bragging about how he and his wife had marched for Black Lives Matter. That’s all very well and good and a credit to him and his wife, but it apparently didn’t help him understand that systemic racism (or even just a system in which there are racial disparities in health care) can be maintained even if he himself is about as non-racist as a white person can be. It also didn’t help that, rather than responding in a letter to the NEJM, he chose to respond with an editorial in a business journal.

He continued as well, in a press release by Masimo about a study conducted and reported by Dr. Steven J. Barker (Chief Science Officer, Masimo) and Dr. William C. Wilson (Chief Medical Officer, Masimo). It was a retrospective analysis of Masimo laboratory data obtained from black and white volunteer subjects, in an effort to identify differences in Masimo pulse oximeter accuracy and bias between ethnic groups, and its findings were that there was no difference in accuracy. I was curious, though. Was this study ever published in the peer-reviewed biomedical literature? It turns out that it appears not to have been. (At least, if it was, I was unable to find it on PubMed.) This makes Kiani’s charge against the University of Michigan researchers publishing their work as a brief report in NEJM rather…incongruous.

How we should respond

I’ll also say right here that I get it. Kiani’s company is clearly his life’s work, and he felt attacked as being racist (or at least unconcerned about it). However, his response was very unhelpful in that, rather than reacting in a manner that took the observations seriously and asked what might account for the differences between his company’s claimed data and the published reports, Kiani became incredibly defensive and went on the attack. Even though these observations weren’t about him personally, Kiani certainly perceived them as being about him and responded accordingly.

As an aging white guy myself, I understand the urge to take criticism of how we do things in medicine as a personal accusation of racism. You know what helped me learn to check that tendency? Being involved with a program at our cancer institute studying racial disparities in cancer care and outcomes, as well as my several-year stint on a statewide quality improvement initiative for breast cancer care. There I learned that there are indeed significant disparities in outcomes in breast cancer. For example, age-adjusted breast-cancer mortality is about 40% higher among Black women than among non-Hispanic White women, even though there is a slightly lower incidence of breast cancer among Black women. The potential causes are several, including decreased access to screening, a higher rate of lacking health insurance resulting in delayed or deferred care (particularly adjuvant treatments), and a number of other factors, some possibly hereditary. A discussion of these reasons is beyond the scope of this post, but the disparity exists. In fact, disparities in cancer outcomes exist for a number of cancers.

In reproductive health care, racial disparities are even more stark. For instance, in 2018 Black mothers in Missouri died during pregnancy, or within one year of pregnancy, at four times the rate of white mothers. Infant mortality rates are higher for Black pregnancies as well, with 82% of pregnancy-related deaths in Missouri being potentially preventable. These disparities go beyond just Missouri, with Black and American Indian or Alaska Native (AIAN) women having pregnancy-related mortality rates that are over three and two times higher, respectively, than the rate for White women. These disparities are likely to get worse now that judicial alchemy by the Supreme Court is leading to legislative alchemy to ban abortions in huge swaths of the US, particularly given the vagueness of the wording of the exceptions to such bans when the life of the mother is endangered, although this legislative alchemy will endanger healthcare for all pregnant patients.

The list of these sorts of disparities goes on.

Finally, not only can science-based medicine be used to study the causes of such disparities, in some cases, such as that of pulse oximetry, technological fixes are possible:

In an optics lab at Brown University, PhD student Rutendo Jakachira explains how a pulse oximeter works.

“If you insert your finger in this groove, the LED at the top is sending light through your finger,” says Jakachira. The device can then calculate a patient’s oxygenation by figuring out how much of the light was absorbed by hemoglobin in the blood.

That’s key to the problem being seen in people with dark skin, says Kimani Toussaint, a professor of electrical and computer engineering, biomedical engineering, and mechanical engineering at Brown University. “It’s assuming that the only absorber of the light energy is the hemoglobin.”

But in reality the skin pigmentation also absorbs the light, he says. And for people with darker skin, that can result in a reading from the pulse oximeter that overestimates the amount of oxygen in their blood.

Toussaint stands next to a table full of technology he hopes will solve the problem.

“I wouldn’t even call this a device yet,” he says.

Unlike current pulse oximeters, the not-quite-yet-a-device uses polarized light which isn’t absorbed by skin pigmentation. If it works correctly, Toussaint says they’ll partner with manufacturers to shrink it all down into a device that could be marketed.

And:

At Tufts University, Valencia Koomson is working on tackling this problem using a different approach.

Her device uses the same kind of light as currently available pulse oximeters do, but it includes technology that can measure a person’s skin tone (people with darker skin pigmentation have higher levels of melanin).

“We can send more light if there’s a higher level of melanin present, so that melanin doesn’t become a confounding factor that obscures our results,” says Koomson, who is an associate professor of electrical and computer engineering.

In the end, though, the case of pulse oximetry is just one example of racial disparities in medicine that science-based medicine needs to address. It’s clearly an important one, even if the exact magnitude of the disparity remains unclear; it’s also unusual in that a technical fix might greatly mitigate it. Unfortunately, the rest of the disparities in care based on race, socioeconomic status, and more are not as easily addressed.

One thing is for sure. Whatever the solution is to each disparity, if we react to discussions of these disparities as though we are being personally attacked for being racist, it will only slow the process of researching their causes and finding solutions.

Author

Posted by David Gorski

David H. Gorski, MD, PhD, FACS is a surgical oncologist at the Barbara Ann Karmanos Cancer Institute specializing in breast cancer surgery, where he also serves as the American College of Surgeons Committee on Cancer Liaison Physician as well as an Associate Professor of Surgery and member of the faculty of the Graduate Program in Cancer Biology at Wayne State University.