The intersection of healthcare and artificial intelligence is changing approaches to patient care, and recent work from researchers at the MUSC Hollings Cancer Center is one example. A team led by Jihad Obeid, M.D., and Mario Fugal, Ph.D., has developed a natural language processing (NLP) model designed to decode and classify complex medical narratives within patient records. The model targets a specific challenge: identifying the primary cancer diagnosis in patients undergoing stereotactic radiosurgery (SRS) for brain metastases, a critical factor in tailoring effective therapeutic strategies.
Brain metastases, secondary tumors originating from cancers elsewhere in the body such as the lung, breast, skin, kidney, or digestive tract, pose intricate clinical dilemmas. The brain’s delicate architecture necessitates precision in radiation therapy, particularly with SRS, which delivers a concentrated dose in a one-time session. However, the efficacy and safety of SRS rely heavily on understanding the tumor’s lineage. Some cancers, like those rooted in lung tissue, exhibit high radiosensitivity and respond favorably to lower radiation doses, while others, including renal cancers, demonstrate resistance, demanding alternative dosing and treatment regimens. Accurately pinpointing the origin of brain metastases is therefore paramount to minimizing collateral damage and optimizing patient outcomes.
Historically, clinicians have grappled with the unstructured and often inconsistent format of medical records, especially when vital information is buried within extensive free-text clinical notes. Despite the existence of standardized coding systems such as the International Classification of Diseases (ICD), these codes frequently fall short in capturing the nuanced details necessary for specialized cancer treatments. ICD codes tend to be too broad, failing to distinguish between subtypes or the precise anatomical location of the primary tumor, which are essential variables in personalized treatment planning.
The MUSC research team circumvented this bottleneck by leveraging NLP, a sub-discipline of artificial intelligence focused on enabling machines to interpret human language. Trained to recognize semantic patterns, keywords, and contextual clues embedded in clinical notes, the model discerns specific cancer types and subtypes with high accuracy. For instance, terms like “ductal” signal breast cancer, whereas “melanoma” indicates skin cancer. This semantic precision allows for a more detailed and patient-specific cancer classification than conventional coding provides.
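To make the keyword idea concrete, here is a minimal Python sketch of cue-word matching on a clinical note. It is an illustration only and not the MUSC team's model: the `KEYWORDS` table and the `classify_note` function are hypothetical names, the table contains just the two cue words mentioned above, and a real system would also need the contextual handling the researchers describe.

```python
import re
from collections import Counter

# Hypothetical cue-word table for illustration; it holds only the two
# terms quoted in the article, not the vocabulary used in the study.
KEYWORDS = {
    "breast": ["ductal"],
    "skin": ["melanoma"],
}


def classify_note(text):
    """Return the label whose cue words occur most often in the note, or None."""
    lowered = text.lower()
    counts = Counter()
    for label, terms in KEYWORDS.items():
        for term in terms:
            # Whole-word match so that a cue word is not found inside a longer word.
            pattern = r"\b" + re.escape(term) + r"\b"
            counts[label] += len(re.findall(pattern, lowered))
    if not counts or max(counts.values()) == 0:
        return None  # no cue word found in this note
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    print(classify_note("Invasive ductal carcinoma, now with brain metastases."))
    print(classify_note("Metastatic melanoma, referred for SRS."))
    print(classify_note("Follow-up visit, no primary documented."))
```

Simple counting like this ignores negation and context (for example, "no evidence of melanoma"), which is one reason a trained model is used rather than a fixed word list.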
The NLP model was evaluated on a dataset of more than 82,000 radiation oncology notes from the electronic health records (EHRs) of more than 1,400 patients treated with SRS for brain metastases. Its output was benchmarked against ground-truth annotations manually verified by expert reviewers, and it extracted primary cancer diagnoses with over 90% accuracy overall. For prevalent cancers such as those of the lung, breast, and skin, classification accuracy reached nearly 97%, including the precise identification of lung cancer subtypes, which ICD coding does not capture.
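The comparison against reviewer-verified labels can be sketched in a few lines of Python. This is a generic illustration of how overall and per-diagnosis accuracy are commonly computed, not the study's evaluation code; the function name `accuracy_report` and the four toy labels are invented for the example and are not data from the paper.

```python
from collections import defaultdict


def accuracy_report(predicted, reference):
    """Return (overall accuracy, accuracy per reference label)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    hits = defaultdict(int)    # correct predictions, keyed by reference label
    totals = defaultdict(int)  # number of patients, keyed by reference label
    for pred, ref in zip(predicted, reference):
        totals[ref] += 1
        if pred == ref:
            hits[ref] += 1
    overall = sum(hits.values()) / len(reference) if reference else 0.0
    per_label = {label: hits[label] / totals[label] for label in totals}
    return overall, per_label


if __name__ == "__main__":
    # Toy labels invented for this example only.
    model_labels = ["lung", "lung", "breast", "skin"]
    reviewer_labels = ["lung", "breast", "breast", "skin"]
    overall, per_label = accuracy_report(model_labels, reviewer_labels)
    print(overall)
    print(per_label)
```

Here the per-label figure is the share of patients with a given reviewer-assigned diagnosis that the model labeled correctly, which is one common way to report accuracy separately for each cancer type.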
One of the compelling facets of this development is the model’s operational simplicity and scalability. Unlike more computationally intensive AI innovations, this approach does not demand expansive datasets or heavy resource investment. Importantly, it avoids the ethical and privacy concerns often associated with complex generative AI systems, positioning it as an immediately deployable tool for a wide range of healthcare settings, including those with limited infrastructural capacity.
The clinical implications of integrating such a model are profound. By automating the extraction of relevant diagnostic information from unstructured physician notes, the technology expedites the data availability that oncologists need for timely decision-making. This acceleration can significantly reduce the latency between diagnosis and treatment, thereby enhancing patient outcomes. Furthermore, systematically captured, high-fidelity data can underpin more robust research studies and clinical trials, fostering a cycle of continuous improvement in cancer care.
Looking forward, the MUSC team is extending this NLP framework to address other pressing clinical challenges, such as early detection of radiation necrosis—a serious, albeit rare, inflammatory side effect marked by brain swelling following radiation therapy. Identifying patients at heightened risk for such complications can enable preemptive interventions or adjustments to treatment protocols, mitigating harm and improving quality of life.
Moreover, the adaptability of the NLP model holds promise for integration with multimodal healthcare data streams. Combining unstructured clinical narratives with imaging data, laboratory results, or genomic information could yield richer, multidimensional insights into cancer biology and patient prognosis. This multidisciplinary data fusion represents the vanguard of precision oncology and offers a roadmap toward truly personalized medicine.
At its core, this research embodies a broader paradigm shift within healthcare: repurposing electronic health records from static repositories to dynamic, analyzable datasets capable of informing real-time clinical decisions. By harnessing AI-driven tools like NLP, clinicians can transcend the limitations of current documentation formats, transforming the vast expanse of textual data into actionable knowledge that benefits both patients and providers.
As cancer treatments grow increasingly sophisticated and individualized, tools that bridge the gap between raw clinical documentation and precise medical understanding will become indispensable. The MUSC Hollings Cancer Center’s NLP model demonstrates how targeted AI applications can catalyze this transformation, ensuring that technological advances translate directly into improved patient care without adding to the burdens shouldered by healthcare professionals.
Subject of Research: People
Article Title: Classifying Stereotactic Radiosurgery Patients by Primary Diagnosis Using Natural Language Processing
News Publication Date: 13-Jun-2025
Web References:
https://ascopubs.org/doi/10.1200/CCI-24-00268
https://hollingscancercenter.musc.edu/
References:
Jihad Obeid, M.D., Mario Fugal, Ph.D., et al. “Classifying Stereotactic Radiosurgery Patients by Primary Diagnosis Using Natural Language Processing.” JCO Clinical Cancer Informatics, 13 June 2025.
Image Credits:
Medical University of South Carolina / Photo by Clif Rhodes
Keywords: Cancer, Brain cancer, Artificial intelligence, Natural language processing
Tags: artificial intelligence in medical practice, brain metastases diagnosis challenges, cancer treatment optimization strategies, enhancing patient outcomes with AI, identifying primary cancer origins, improving therapeutic strategies for cancer, MUSC Hollings Cancer Center research, natural language processing in healthcare, NLP applications in oncology, patient record analysis using NLP, precision radiation therapy techniques, stereotactic radiosurgery advancements