A Machine Learning Health System to Integrate Care for Substance Misuse and HIV Treatment and Prevention Among Hospitalized Patients
Project Final Report (PDF, 209.8 KB)
Applying machine learning (ML) and natural language processing (NLP) to hospital clinical notes showed moderate success in identifying patients at high risk for HIV. The results demonstrate the potential of automating risk screening in acute care settings and underscore the need for better documentation to support timely, scalable HIV prevention and care.
Project Details
- Status: Completed
- Grant Number: R21 HS028511
- Project Amount: $296,576
- Location: Chicago, Illinois
- Project Dates: 09/01/2021 - 08/31/2024
The overlap between substance misuse and HIV is a longstanding and growing public health concern, intensified by the opioid epidemic. Since the 1990s, rising overdose deaths and increased HIV transmission have underscored the need for targeted prevention strategies. This is especially true for people who inject drugs, who face an HIV prevalence of 17 percent, far higher than the 0.34 percent observed in the general population. Hospitalization presents a critical opportunity to engage patients in care, yet these moments are often missed due to the fast pace of acute care and the impracticality of manual screening in busy settings. This research explored whether ML-based clinical decision support (CDS) tools could help automate HIV risk identification and connect more patients to timely testing, prevention, and care.
Researchers developed, trained, and evaluated an ML-based classifier, a tool that learns from data to automatically categorize or label new information, using NLP to digitally triage hospitalized patients into HIV risk categories. By analyzing past cases using electronic health record (EHR) data, such as demographics, diagnoses, and lab results, the tool aimed to identify individuals at increased risk of acquiring or transmitting HIV, particularly those with substance use disorders or high-risk sexual behaviors. Researchers conducted the study at Rush University Medical Center, which serves communities with some of the highest heroin overdose death rates in Chicago. Their aim was to improve the reach and efficiency of HIV prevention in acute care settings by automating the risk identification process.
The specific aims of the research were as follows:
- Develop, train, and test an ML classifier to identify risk for HIV acquisition or transmission among hospitalized patients with substance misuse.
- Integrate the ML classifier into the EHR infrastructure to test its predictive validity and efficacy.
- Improve the HIV prevention framework.
- Address critical gaps in risk assessment.
The research team tested several ML models using both structured EHR data (diagnoses, lab results, and demographics) and unstructured information from clinical notes, which was analyzed with NLP. The goal was to classify hospitalized patients into four levels of HIV risk, with a focus on identifying those most at risk of acquiring or transmitting the virus. The most effective model, a convolutional neural network, predicted a patient’s risk using data from the first 24 hours of admission. To assess accuracy, researchers compared the model’s predictions to past patient records reviewed by clinical experts, helping to evaluate its real-world potential.
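To make the 24-hour prediction window concrete, the following is a minimal Python sketch of how one admission's records might be restricted to the first 24 hours and split into structured items and note text before being passed to a model. It is an illustration only, not the study's pipeline: the `Event` record, its field names, and the `first_24h_inputs` helper are all hypothetical, and the example values are made up.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    """One timestamped EHR item. All field names are hypothetical."""
    patient_id: str
    recorded_at: datetime
    kind: str      # "structured" (diagnosis, lab, demographic) or "note"
    content: str


def first_24h_inputs(events, admitted_at, window_hours=24):
    """Keep only events recorded within `window_hours` of admission and
    split them into structured items and concatenated note text."""
    cutoff = admitted_at + timedelta(hours=window_hours)
    structured, notes = [], []
    for ev in sorted(events, key=lambda e: e.recorded_at):
        if not (admitted_at <= ev.recorded_at < cutoff):
            continue  # outside the prediction window
        if ev.kind == "note":
            notes.append(ev.content)
        else:
            structured.append(ev.content)
    return {"structured": structured, "note_text": " ".join(notes)}


# Made-up example: the third event falls 25 hours after admission and is dropped.
admitted = datetime(2022, 1, 1, 8, 0)
example_events = [
    Event("p1", datetime(2022, 1, 1, 9, 0), "note", "reports injection drug use"),
    Event("p1", datetime(2022, 1, 1, 10, 0), "structured", "lab: hepatitis C antibody positive"),
    Event("p1", datetime(2022, 1, 2, 9, 0), "note", "discharge planning"),
]
model_inputs = first_24h_inputs(example_events, admitted)
```

Filtering by timestamp before any feature extraction keeps information documented later in the stay from leaking into a prediction that is meant to be available early in the admission.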
The ML classifier represents a significant step toward proactive HIV risk identification and intervention. It showed moderate success in identifying HIV risk using EHR data from the first 24 hours of hospitalization and performed especially well in ruling out low-risk cases. While it was less effective at detecting the highest-risk individuals, it still flagged more than half of them, demonstrating both the potential and limitations of automated screening tools in real-world settings. The researchers emphasized the need for improved clinical documentation to support more accurate risk detection. To encourage broader adoption, they are planning to make the computer code and ML algorithm publicly available on GitHub, a popular platform for sharing and collaborating on software projects. They also plan to integrate the tool into Rush University Medical Center’s clinical workflows as part of a broader CDS approach. If validated and scaled, the tool could serve as a model for other hospitals to support more consistent, systemwide HIV prevention, aligning with public health goals to increase early case detection and reduce HIV transmission in high-prevalence communities.
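Statements such as "flagged more than half of the highest-risk individuals" are typically expressed as per-class metrics computed against expert-reviewed labels. The sketch below shows one common way to do this for a four-level classifier, using one-vs-rest sensitivity and precision. It is an assumption about how such results could be computed, not the metrics or code reported by the study: the integer risk levels 0 to 3, the function names, and the toy labels are all invented for illustration.

```python
def one_vs_rest_counts(reference, predicted, target):
    """Count true/false positives and negatives, treating `target` as the
    positive class and every other risk level as negative."""
    tp = fp = fn = tn = 0
    for ref, pred in zip(reference, predicted):
        if ref == target and pred == target:
            tp += 1
        elif ref != target and pred == target:
            fp += 1
        elif ref == target and pred != target:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity(reference, predicted, target):
    """Share of expert-labeled `target` cases that the model also flagged."""
    tp, _, fn, _ = one_vs_rest_counts(reference, predicted, target)
    return tp / (tp + fn) if (tp + fn) else None


def precision(reference, predicted, target):
    """Share of model-assigned `target` labels that the expert agreed with."""
    tp, fp, _, _ = one_vs_rest_counts(reference, predicted, target)
    return tp / (tp + fp) if (tp + fp) else None


# Toy labels only (0 = lowest risk, 3 = highest risk); not study data.
expert_labels = [0, 0, 0, 1, 2, 3, 3, 3]
model_labels = [0, 0, 0, 0, 2, 3, 3, 1]

highest_risk_sensitivity = sensitivity(expert_labels, model_labels, target=3)
lowest_risk_precision = precision(expert_labels, model_labels, target=0)
```

In this toy data the model flags two of the three highest-risk cases, and three of its four lowest-risk assignments match the expert label. Reporting both figures per risk level shows where a classifier is reliable and where it misses cases.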
