BIOENGINEER.ORG
Scientists use machine learning models to help identify long COVID patients

by Bioengineer
May 17, 2022
in Health

CHAPEL HILL, NC – Clinical scientists used machine learning (ML) models to explore de-identified electronic health record (EHR) data in the National COVID Cohort Collaborative (N3C), a National Institutes of Health-funded national clinical database, to help discern characteristics of people with long-COVID and factors that may help identify such patients using data from medical records.

Image: SARS. Credit: NIAID

The findings, published in The Lancet Digital Health, have the potential to improve clinical research on long COVID and inform a more standardized care regimen for the condition.

“Characterizing, diagnosing, treating and caring for long-COVID patients has proven to be a challenge due to the list of characteristic symptoms continuously evolving over time,” said first author Emily R. Pfaff, PhD, assistant professor in the Division of Endocrinology and Metabolism at the UNC School of Medicine. “We needed to gain a better understanding of the complexities of long-COVID, and for that it made sense to take advantage of modern data analysis tools and a unique big data resource like N3C, where many features of long COVID are represented.”

Sponsored by the National Institutes of Health’s National Center for Advancing Translational Sciences (NCATS), the N3C data enclave currently includes information representing more than 13 million people from 72 sites nationwide, including nearly 5 million COVID-19-positive cases. The resource enables rapid research on emerging questions about COVID-19 vaccines, therapies, risk factors and health outcomes.

This new research is part of the National Institutes of Health’s Researching COVID to Enhance Recovery (RECOVER) initiative, which has been recruiting thousands of participants nationwide to answer critical research questions about the syndrome, including how to accurately identify who has long COVID, what its risk factors are, and which interventions and treatments may help.

Using the N3C, researchers developed XGBoost machine learning (ML) models to understand patient characteristics and better identify potential long-COVID patients.

Researchers examined demographics, healthcare utilization, diagnoses, and medications for 97,995 adult COVID-19 patients. They used these features, together with data on nearly 600 long-COVID patients from three long-COVID specialty clinics, to train and test three ML models, each focused on identifying potential long-COVID patients in one of three groups: all COVID-19 patients, patients hospitalized with COVID-19, and patients who had COVID-19 but were not hospitalized.

The models proved to be accurate in identifying potential long-COVID patients, achieving areas under the receiver operating characteristic curve (AUROC), a measure of accuracy used by machine learning researchers, of 0.91 (all patients), 0.90 (hospitalized), and 0.85 (non-hospitalized). Patients flagged by the models can be interpreted as “patients warranting care at a long-COVID specialty clinic.” Applying the model to the larger N3C cohort can also help achieve the urgent goal of identifying long-COVID patients for clinical trials.
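For readers unfamiliar with the metric, the sketch below shows one common way to compute an area under the ROC curve from model scores. It is an illustration only, not the study's code: the function name `auroc` and the toy labels and scores are hypothetical, and the calculation uses the standard pairwise (Mann-Whitney) interpretation, where the result is the share of positive/negative pairs in which the positive case gets the higher score.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    The value is the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case; ties count as half.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (not from the study):
# 1 = patient seen at a long-COVID clinic, 0 = not seen at one.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.92, 0.80, 0.35, 0.60, 0.30, 0.20, 0.10]
print(f"AUROC = {auroc(toy_labels, toy_scores):.3f}")  # AUROC = 0.917
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a sense of scale for the reported 0.85 to 0.91 range.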

The models also showed many important features that differentiate potential long-COVID patients from non-long-COVID patients. They focused on patients with a positive COVID diagnosis who were at least 90 days out from their acute infection. Features more commonly identified among potential long-COVID patients include post-COVID respiratory symptoms and associated treatments, non-respiratory symptoms widely reported as part of long COVID (such as sleep disorders, anxiety, malaise, chest pain, and constipation), pre-existing risk factors for greater acute COVID severity (such as chronic pulmonary disease, diabetes, and chronic kidney disease), and proxies for hospitalization, suggesting greater severity of acute COVID. The study also points out that it is plausible that long COVID will not ultimately have a single definition, and may be better described as a set of related conditions with their own symptoms, trajectories, and treatments.
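The 90-day window and the three patient groups described above can be pictured as a simple filtering step. The sketch below is a minimal illustration under stated assumptions: the `Patient` record, its field names, and the sample patients are hypothetical and do not reflect the N3C data model or the study's actual pipeline.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List

MIN_DAYS_SINCE_INFECTION = 90  # threshold stated in the article


@dataclass(frozen=True)
class Patient:
    patient_id: str      # hypothetical identifier
    index_date: date     # hypothetical field: date of acute COVID-19 diagnosis
    hospitalized: bool   # hypothetical flag: hospitalized with COVID-19


def is_eligible(patient: Patient, as_of: date) -> bool:
    """True if the patient is at least 90 days past the acute infection."""
    return (as_of - patient.index_date).days >= MIN_DAYS_SINCE_INFECTION


def build_cohorts(patients: List[Patient], as_of: date) -> Dict[str, List[Patient]]:
    """Split eligible patients into the three groups named in the article."""
    eligible = [p for p in patients if is_eligible(p, as_of)]
    return {
        "all": eligible,
        "hospitalized": [p for p in eligible if p.hospitalized],
        "non_hospitalized": [p for p in eligible if not p.hospitalized],
    }


# Hypothetical sample patients (not real data).
patients = [
    Patient("A", date(2021, 1, 10), True),
    Patient("B", date(2021, 3, 1), False),
    Patient("C", date(2021, 9, 1), False),  # only 30 days out, so excluded
]
cohorts = build_cohorts(patients, as_of=date(2021, 10, 1))
for name, members in cohorts.items():
    print(name, [p.patient_id for p in members])
```

In this toy setup, each group would then get its own model, mirroring the article's description of three separate models.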

“These results speak to the powerful impact of real-world clinical data and the potential capabilities of N3C to help better understand and find solutions for significant public health problems such as long COVID,” said NCATS Acting Director Joni Rutter, PhD.

Josh Fessel, MD, PhD, senior clinical advisor at NCATS and a scientific program lead in RECOVER, added, “Once you’re able to determine who has long COVID in a large database of people, you can begin to ask questions about those people. Was there something different about those people before they developed long COVID? Did they have certain risk factors? Was there something about how they were treated during acute COVID that might have increased or decreased their risk for long COVID?”

The study also addressed how electronic health record (EHR) data is skewed toward patients who make more use of healthcare systems. Pfaff says that it is essential to acknowledge whose data is less likely to be represented: uninsured patients, patients with limited access to or ability to pay for care, or patients seeking care at small practices or community hospitals with limited data exchange capabilities.

“Electronic Health Records (EHRs) only have information for people who go to the doctor,” said Pfaff, who is also Co-Director of the NC TraCS Informatics and Data Science (IDSci) Program. “They also have more information on people who go to the doctor a lot. So, people who don’t have good access to care or people who don’t go to the doctor, we’re just not going to have information about them. So this is a caveat that I offer with every EHR based study that I do. We need to recognize who’s not in the dataset.”

The N3C team continues to refine its models as more real-world data emerges. Their longitudinal data for COVID-19 patients can provide a comprehensive foundation for the development of ML models to identify potential long-COVID patients. As larger cohorts of long-COVID patients are established, future work will include research to identify subtypes of long-COVID, making the condition easier to study and treat.

“Depending on where the research leads, we may find that patients with different presentations of long COVID are different enough to warrant different treatments entirely,” said Pfaff. “So, it’s important for us to determine if long COVID is one disease, or a constellation of related conditions that are also related to having had acute COVID-19.”

With the help of this big data approach, study recruitment can become more efficient, deepening the understanding of long COVID and its complexities. Beyond identifying cohorts for research studies, understanding and validating the relationship between long COVID and social determinants of health, demographics, comorbidities, and treatment implications will improve the algorithm in these models as more evidence emerges.

“Research studies, particularly clinical trials, are one of our best tools for gaining understanding of long COVID — its presentation, risk factors, and potential treatments,” said Pfaff. “For the best chance at success, studies need large and diverse groups of participants who qualify, which aren’t easy to find. Using algorithms like the one we’ve created on large clinical datasets can narrow down vast numbers of patients to those who could qualify for a long COVID trial, potentially giving researchers a head start on recruitment, making trials more efficient, and hopefully getting to findings faster.”

This study was funded by NCATS and NIH through the RECOVER Initiative.

About the National Center for Advancing Translational Sciences (NCATS): NCATS conducts and supports research on the science and operation of translation — the process by which interventions to improve health are developed and implemented — to allow more treatments to get to more patients more quickly. For more information about how NCATS helps shorten the journey from scientific observation to clinical intervention, visit https://ncats.nih.gov.



Journal

The Lancet Digital Health

DOI

10.1016/S2589-7500(22)00048-6

Method of Research

Data/statistical analysis

Subject of Research

People

Article Title

Identifying who has long COVID in the USA: a machine learning approach using N3C data

Article Publication Date

17-May-2022

Bioengineer.org © Copyright 2023 All Rights Reserved.
