New Artificial Intelligence Algorithm Reveals Long COVID Prevalence Twice as High as Previously Estimated Across U.S. Hospitals
Research from Mass General Brigham suggests that current surveillance methods substantially underestimate the true burden of long COVID, which may be about twice as high as previously reported. Using an artificial intelligence algorithm, researchers analyzed the electronic health records of nearly 460,000 COVID-19 patients across 58 hospitals in the United States. Their findings, published in JAMA Network Open, indicate that approximately 16.3 percent of these patients developed long COVID, a chronic and heterogeneous condition characterized by persistent multisystem symptoms following acute infection. Extrapolated nationally, that corresponds to an estimate of more than 18 million Americans affected.
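For readers who want to check the headline numbers, the short Python sketch below works through the arithmetic. It uses only the figures quoted above; the cohort size is rounded ("nearly 460,000"), and the "implied infected population" is simply the 18 million estimate divided by the prevalence, an inference made here for illustration rather than a figure reported by the study.

```python
# Back-of-envelope check of the figures quoted in the article.
cohort_size = 460_000   # "nearly 460,000" patients; rounded, so results are approximate
prevalence = 0.163      # 16.3 percent of the cohort developed long COVID

# Approximate number of long COVID cases within the study cohort.
cohort_cases = cohort_size * prevalence

# Assumption for illustration: the national estimate applies the same
# prevalence to some infected population. Solve for that population.
national_estimate = 18_000_000   # "more than 18 million"; treated as a lower bound
implied_infected = national_estimate / prevalence

print(f"Approximate long COVID cases in cohort: {cohort_cases:,.0f}")
print(f"Implied infected population: about {implied_infected / 1e6:.0f} million")
```

On these assumptions, the cohort contains roughly 75,000 long COVID cases, and the national estimate implies an infected population on the order of 110 million.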
Traditional public health monitoring has relied primarily on diagnostic coding, chiefly the ICD-10 code U09.9 for post-COVID conditions, to track long COVID incidence. However, this method captures fewer than 7 percent of actual cases, resulting in significant underreporting. The Mass General Brigham team addressed this gap by creating a precision-phenotyping AI algorithm designed specifically for analyzing longitudinal electronic health records. The tool examines temporal sequences of clinical events to detect new-onset conditions that cannot be attributed to preexisting illness, distinguishing true post-COVID pathologies from comorbidities.
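The idea of flagging new-onset conditions that cannot be explained by a patient's prior history can be illustrated with a deliberately simplified Python sketch. This is not the study's algorithm, which the article does not describe in detail. The function name, the record format, and the 30-day and 365-day window limits are all hypothetical choices made here for illustration.

```python
from datetime import date, timedelta


def new_onset_conditions(events, infection_date, min_days=30, max_days=365):
    """Toy exclusion rule (hypothetical, not the published method).

    events: list of (date, condition) tuples from one patient's record.
    Returns the sorted conditions recorded inside the post-infection window
    that never appear in the record before the infection date.
    """
    # Anything documented before infection counts as preexisting.
    preexisting = {cond for when, cond in events if when < infection_date}

    # Assumed observation window after infection (illustrative thresholds).
    start = infection_date + timedelta(days=min_days)
    end = infection_date + timedelta(days=max_days)

    flagged = set()
    for when, cond in events:
        if start <= when <= end and cond not in preexisting:
            flagged.add(cond)
    return sorted(flagged)


# Made-up example record, for demonstration only.
infection = date(2022, 1, 10)
record = [
    (date(2021, 3, 1), "hypertension"),
    (date(2022, 1, 10), "covid-19"),
    (date(2022, 3, 15), "hypertension"),
    (date(2022, 4, 2), "dysautonomia"),
    (date(2022, 6, 20), "prediabetes"),
]
result = new_onset_conditions(record, infection)
print(result)  # ['dysautonomia', 'prediabetes']
```

In this toy record, hypertension is excluded because it was documented before the infection, while dysautonomia and prediabetes are flagged as new-onset. A real phenotyping system would need to go much further, for example by ruling out other causes that arise after infection.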
Because long COVID is effectively a diagnosis of exclusion, the algorithm systematically identifies patterns and temporal associations consistent with the sequelae of SARS-CoV-2 infection. The analysis covered geographically diverse U.S. regions (New England, Southeast Texas, Southern California, and Western Pennsylvania) and found long COVID prevalence rates ranging from 13.6 percent to 22.7 percent. The study also highlighted significant regional disparities in specific long COVID manifestations, such as varying incidences of prediabetes, which has emerged as a notable metabolic consequence of the condition.
Contrary to earlier assumptions framing long COVID as predominantly a legacy of initial pandemic waves, the data analysis demonstrated a sustained upward trajectory in cumulative prevalence across all examined regions. This persistent increase underscores SARS-CoV-2 as a continuing catalyst for the development of diverse chronic conditions impacting multiple organ systems. The longitudinal nature of this study, utilizing electronic health records spanning extensive temporal windows, allowed for dynamic statistical modeling. It predicted that, without substantial changes in current trends, the long COVID burden will expand exponentially over the coming decade, posing a formidable public health challenge.
The research team also underscored limitations inherent in the study’s methodology. Notably, their calculations excluded individuals with undocumented infections, which now constitute a growing majority given the cessation of widespread testing, as well as patients lacking comprehensive longitudinal medical records. These gaps imply that the actual prevalence and clinical diversity of long COVID may be even more expansive than reported, emphasizing the urgency for enhanced diagnostic and surveillance frameworks.
Experts involved in the study emphasized the clinical consequences of underdiagnosis when surveillance relies on diagnostic codes alone. Patients presenting with distinct clinical syndromes, such as dysautonomia observed by cardiologists, metabolic derangements noted by endocrinologists, or neurocognitive impairments flagged by neurologists, often fail to receive a unifying long COVID diagnosis. Consequently, their care can become fragmented, and opportunities for targeted interventions are missed. The AI-driven surveillance approach offers a more granular view of post-COVID conditions and helps connect diverse clinical phenotypes to prior SARS-CoV-2 infections.
According to study co-author Shawn Murphy, MD, PhD, Chief Research Information Officer at the University of Washington, this research exemplifies the potential of clinical AI when thoughtfully integrated into healthcare systems. By drawing on longitudinal real-world clinical data, AI tools can strengthen public health monitoring and support the consistent identification of multifaceted post-viral syndromes. This approach may spur the development of tailored clinical trials and personalized therapies, ultimately improving outcomes for patients.
Lead author Jiazi Tian, MSc, a data scientist at Mass General Brigham, reflected on the silent scale of missed diagnoses: patients who are actively seeking care remain invisible to standard surveillance because of deficient coding practices. Tian noted that many long COVID manifestations arrive unlabeled, obscuring their epidemiological link to COVID-19. The study's methodology helps bring these hidden patient populations to light, giving a clearer picture of the interplay between SARS-CoV-2 infection and chronic disease sequelae.
The ability to parse out specific organ-related and clinical manifestations of long COVID represents a crucial advance in managing a condition that spans pulmonology, cardiology, neurology, endocrinology, and beyond. As author Hossein Estiri, PhD, pointed out, enriched surveillance data enabled by AI not only improves prevalence estimates but also empowers health systems to design precision interventions. This approach promises to propel future research that can stratify long COVID patients for targeted therapeutics and mitigate the evolving epidemic of chronic post-viral morbidity.
As health systems worldwide grapple with the enduring impact of the COVID-19 pandemic, this study underscores the indispensable role of integrating advanced AI algorithms with comprehensive electronic health record systems. Such technological advancement is pivotal to accurately mapping the epidemiology of long COVID, guiding healthcare policy, and ultimately improving patient care in this persistently challenging clinical landscape.
Subject of Research: People
Article Title: Long COVID Persistence and Surveillance Gaps Across 58 US Hospitals
News Publication Date: 27-May-2026
Web References: https://jamanetwork.com/journals/jamanetworkopen/fullarticle/10.1001/jamanetworkopen.2026.14909
References: Tian J et al. “Long COVID Persistence and Surveillance Gaps Across 58 US Hospitals” JAMA Network Open DOI: 10.1001/jamanetworkopen.2026.14909
Keywords: Long COVID, Artificial Intelligence, Public Health, Epidemiology, Infectious Diseases, SARS-CoV-2, Precision Phenotyping, Electronic Health Records, Chronic Disease, Post-Acute Sequelae, Metabolic Disorders, Dysautonomia
Tags: artificial intelligence in healthcare research, challenges in long COVID diagnosis and reporting, chronic multisystem symptoms of long COVID, electronic health records analysis for COVID, large-scale COVID patient data analysis, limitations of ICD U09.9 coding for COVID, long COVID prevalence in US hospitals, multi-hospital COVID-19 study, novel AI tools for disease surveillance, post-acute sequelae of SARS-CoV-2 infection, precision-phenotyping algorithm for post-COVID, underestimation of long COVID cases



