Studying the Accuracy of a Symptom-Checker App in Diagnosing Strokes in a Real-World Setting

Theme:

Engaging and Empowering Patients

Subtheme:

Technology Solutions to Engage Patients and Their Families in Care

Rigorously evaluating a popular, patient-facing symptom-checker app will improve understanding of its safety, its accuracy, and its ability to prompt patients to seek care.

Millions use healthcare apps for diagnosis, but little research has tested their accuracy in practice

Many Americans rely on healthcare apps such as WebMD or SymptomMD to get an initial diagnosis for their symptoms. While this technology increases access to healthcare information, symptom-checker apps are unregulated and offer no safeguards or assurances that the diagnostic information they provide is accurate. Misdiagnosis of a serious condition can lead to poor medical outcomes or even death, especially if patients stay home rather than seek emergency care.

These symptom-checker apps have not been rigorously evaluated: past studies measured symptom-checker performance against typical disease presentations compiled by clinicians rather than against symptoms reported directly by patients. Such clinician-edited case descriptions are clearer, but they may not accurately reflect how patients actually describe their symptoms.

Symptom-checker apps will be tested in a real-world setting

Dr. Hamish Fraser, an associate professor of medical science and health systems policy and practice at Brown University’s Center for Biomedical Informatics, wants to address this gap in the research. He and a team are studying how accurately the popular symptom-checker app Ada Health can diagnose urgent or emergency complaints, including stroke, based on patient input in real-world settings.

The researchers chose to focus on stroke because its symptoms are often confusing and the timing of treatment is critical: the window for effective stroke treatment is within 4 hours of symptom onset. A misdiagnosis that delays treatment can be catastrophic.

“It’s absolutely critical that we use (the apps) in real patients in real-world situations, exactly as the real world operates, because the situation can be very, very different from a lab test.” – Dr. Hamish Fraser

To study the performance of Ada Health’s symptom-checker app, the team is comparing the app-derived results for patients seen in the emergency department (ED) or at urgent primary care visits against their actual diagnoses.

Real patients, including those with a possible stroke, will enter the symptoms for which they are seeking treatment into the app to obtain its “diagnosis” and treatment recommendations. The same patient-reported information will also be evaluated by physicians who are not treating the patient, to establish an independent clinical diagnosis. These two diagnoses and triage recommendations will then be compared with the actual diagnoses and triage decisions of the treating clinicians. The clinical data collected in the study will also be used to test the WebMD symptom checker and different versions of the ChatGPT language models.
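The article does not spell out how agreement will be scored, but evaluations of this kind are often summarized as top-k diagnostic agreement (did the correct diagnosis appear among the app’s leading suggestions?) and triage concordance (was the app’s urgency recommendation at least as urgent as the clinicians’ decision?). The following is a minimal sketch of that idea in Python; the field names, triage scale, and example cases are invented for illustration and are not taken from the study protocol.

```python
from dataclasses import dataclass
from typing import List

# Illustrative urgency scale, ordered from most to least urgent.
# The study's actual triage categories are not specified in this summary.
TRIAGE_LEVELS = ["emergency", "urgent", "routine", "self-care"]

@dataclass
class Case:
    """One patient encounter: the app's output plus the reference standard."""
    app_differential: List[str]   # app's ranked list of suggested conditions
    app_triage: str               # app's urgency recommendation
    final_diagnosis: str          # diagnosis from the treating clinicians
    clinician_triage: str         # triage decision from the treating clinicians

def top_k_agreement(cases: List[Case], k: int = 3) -> float:
    """Fraction of cases where the reference diagnosis is in the app's top k suggestions."""
    hits = sum(c.final_diagnosis in c.app_differential[:k] for c in cases)
    return hits / len(cases)

def triage_safety(cases: List[Case]) -> float:
    """Fraction of cases where the app's triage was at least as urgent as the clinicians'.

    Under-triage (recommending less urgent care than the patient needed) is the
    key safety concern for conditions like stroke.
    """
    rank = TRIAGE_LEVELS.index  # lower index = more urgent
    safe = sum(rank(c.app_triage) <= rank(c.clinician_triage) for c in cases)
    return safe / len(cases)

# Example with two made-up cases
cases = [
    Case(["stroke", "migraine", "vertigo"], "emergency", "stroke", "emergency"),
    Case(["tension headache", "migraine"], "routine", "transient ischemic attack", "urgent"),
]
print(f"Top-3 agreement: {top_k_agreement(cases):.2f}")
print(f"Triage safety:   {triage_safety(cases):.2f}")
```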

Using machine learning may improve accuracy in assessing symptoms

In addition, Dr. Fraser and the research team will apply machine learning techniques to data from 2,300 patients who presented to the ED with symptoms of stroke, creating new algorithms to improve early stroke diagnosis and risk stratification. The outputs will be compared with existing symptom checkers, algorithms, and guidelines for the diagnosis and management of stroke.
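The article does not describe the team’s modeling approach. One common way to build a risk-stratification algorithm of this kind is to train a classifier on coded presenting features and compare its discrimination against a simple rule-based baseline. The sketch below, in Python with scikit-learn, uses entirely synthetic data and invented feature names; it illustrates the general technique, not the study’s actual models or variables.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic stand-in for coded presenting features (e.g., facial droop, arm
# weakness, speech difficulty, age). The study's real variables are not listed
# in this summary, so these are placeholders.
n = 2300
X = rng.integers(0, 2, size=(n, 3)).astype(float)        # three binary symptom flags
X = np.column_stack([X, rng.normal(65, 15, size=n)])      # plus age in years
logit = -3 + 1.5 * X[:, 0] + 1.2 * X[:, 1] + 1.0 * X[:, 2] + 0.02 * X[:, 3]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))             # 1 = confirmed stroke

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Learned risk model
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
model_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

# Rule-based baseline: a simple count of positive symptom flags (FAST-style score)
baseline_auc = roc_auc_score(y_test, X_test[:, :3].sum(axis=1))

print(f"Learned model AUC:  {model_auc:.3f}")
print(f"Symptom-count AUC:  {baseline_auc:.3f}")
```

In practice the comparison would use the study’s real patient data and existing stroke triage scores rather than synthetic labels, and discrimination (AUC) would be only one of several performance measures alongside calibration and sensitivity at clinically relevant thresholds.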

These research results will improve understanding of symptom-checker safety, helping patients, particularly high-risk ones, better recognize urgent symptoms and seek care, and improving health service use for a broad range of patients, including those with stroke. The results could also have broader policy implications, including better regulation of new large language model apps and other technological tools. Measuring the impact of these apps on health is essential if researchers are to provide sound guidance to policymakers.