Download the 2020 Year in Review Report (PDF, 8.5 MB)

Using Natural Language Processing to Improve Autism Spectrum Disorder Research and Care

Applying algorithms to free text in electronic health records can identify criteria for autism spectrum disorder, which can support earlier detection and treatment as well as research with large-scale data.

Difficulty of accessing unstructured data for decision making

The use of electronic health records (EHRs) and other digital healthcare tools has generated a large volume of data, but these data are often difficult to access and use for decision making. In healthcare, data are critical to providers in diagnosing and making informed treatment decisions. While structured health data, including data coded with a standardized code system such as SNOMED or LOINC, can more readily support analysis and decision making, unstructured data in the form of free text and narratives are not easily extracted for use in care delivery. Natural language processing and other machine learning techniques can convert unstructured text into structured, codified content in an automated manner for larger-scale use and for integration with other data.
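As a rough illustration of what converting free text into structured, codified content can look like, the sketch below maps a few phrases in a note to concept codes and records where each phrase occurs. Everything in it is an assumption made for illustration: the phrase list, the `DEMO-` codes (placeholders, not real SNOMED or LOINC identifiers), and the names `PHRASE_TO_CODE`, `CodedFinding`, and `code_free_text`. Production systems rely on far richer language models and terminologies than a lookup table.

```python
import re
from dataclasses import dataclass

# Hypothetical phrase-to-code lookup. The codes are placeholders,
# NOT real SNOMED or LOINC identifiers.
PHRASE_TO_CODE = {
    "poor eye contact": "DEMO-0001",
    "delayed speech": "DEMO-0002",
    "hand flapping": "DEMO-0003",
}


@dataclass(frozen=True)
class CodedFinding:
    code: str    # placeholder concept code
    phrase: str  # phrase as written in the lookup table
    start: int   # character offset where the match begins in the note
    end: int     # character offset just past the match


def code_free_text(note: str) -> list[CodedFinding]:
    """Return one coded finding per lookup phrase occurrence, in note order."""
    findings = []
    for phrase, code in PHRASE_TO_CODE.items():
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, note, flags=re.IGNORECASE):
            findings.append(CodedFinding(code, phrase, match.start(), match.end()))
    return sorted(findings, key=lambda f: f.start)


note = "Mother reports delayed speech. Poor eye contact noted during the visit."
findings = code_free_text(note)
for finding in findings:
    print(finding.code, finding.phrase, finding.start, finding.end)
```

Each `CodedFinding` is a structured row that could be stored alongside other coded data, which is the kind of integration the paragraph above describes.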

How can we use that valuable information in free text notes?

Dr. Gondy Leroy of the University of Arizona decided to focus on autism spectrum disorders (ASDs) to show how extracting and coding information from free text in EHRs can lead to new insights and treatments. While the prevalence of ASD has increased dramatically in the last two decades, the causes are not well understood, with hypotheses ranging from changing diagnostic criteria to environmental factors. With new research focusing on neural, genetic, and environmental causes, there is a need to extract new types of data from patient records. Many of these data, when they do exist, are contained in free-text notes and are not readily available unless manually extracted. Dr. Leroy and her team sought to create methods and tools for leveraging existing and detailed ASD patient information in EHRs to improve ASD research and, ultimately, to improve earlier diagnosis, treatments, and cures.

“The importance of this research is that the earlier you identify ASD, the earlier you can provide treatments and services. If you identify ASD at 5 years old, compared to 3-1/2 years old, it's a big difference. By catching it earlier, you can start treatment and therapy with that child sooner.”
- Dr. Leroy

Using natural language processing to improve ASD research

The team developed and evaluated natural language processing algorithms to identify ASD behaviors within free text in EHRs, labeling them with the Diagnostic and Statistical Manual of Mental Disorders (DSM) diagnostic criteria for ASD. In addition, machine learning algorithms were used to label a child’s clinical record as either ASD or not ASD. The researchers then developed a prototype user interface that highlights sentences in clinicians’ free text that match DSM criteria for ASD. This study addressed a gap in EHR use in mental health, where behaviors that meet DSM criteria are frequently buried in free text. Given that children with ASD demonstrate widely varying behaviors that qualify for the same DSM criteria, diagnosing these children is complex and may be delayed. The algorithms can be integrated into a user-friendly interface, which can help clinicians with limited expertise diagnose children. This work has the potential to improve earlier diagnosis and treatment of children with ASD and enhance research efforts for ASD. Findings from this research led to a recently awarded $1.5 million grant from the National Institute of Mental Health to expand the technology to support non-expert clinicians in identifying children at risk for autism spectrum disorder.
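To make the workflow described above more concrete, here is a minimal sketch of sentence-level labeling, highlighting, and a record-level flag. It is not the team's method: the criterion labels, the cue patterns in `CRITERION_PATTERNS`, the bracket markup, and the `min_labels` threshold are all hypothetical, and the record-level step is a simple counting rule standing in for the machine learning classifier the article mentions.

```python
import re

# Hypothetical cue patterns per criterion label. Both the labels and the
# patterns are toy examples, not clinical definitions from the DSM.
CRITERION_PATTERNS = {
    "social communication": [r"\bpoor eye contact\b", r"\bdoes not respond to name\b"],
    "repetitive behavior": [r"\bhand[- ]flapping\b", r"\brocks back and forth\b"],
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def label_sentence(sentence):
    """Return the criterion labels whose cue patterns occur in the sentence."""
    labels = []
    for label, patterns in CRITERION_PATTERNS.items():
        if any(re.search(p, sentence, flags=re.IGNORECASE) for p in patterns):
            labels.append(label)
    return labels


def highlight(text):
    """Wrap labeled sentences in [[...]] and append their labels, one per line."""
    lines = []
    for sentence in split_sentences(text):
        labels = label_sentence(sentence)
        if labels:
            lines.append(f"[[{sentence}]] <{', '.join(labels)}>")
        else:
            lines.append(sentence)
    return "\n".join(lines)


def flag_record(text, min_labels=2):
    """Toy stand-in for a classifier: flag a record when cues for at least
    `min_labels` distinct criterion labels appear anywhere in the text."""
    found = set()
    for sentence in split_sentences(text):
        found.update(label_sentence(sentence))
    return len(found) >= min_labels


note = (
    "Child makes poor eye contact with the examiner. "
    "Parents report frequent hand flapping when excited. "
    "Appetite is normal."
)
print(highlight(note))
print("Flag for review:", flag_record(note))
```

A sketch like this ignores negation ("no hand flapping observed" would still match) and the many ways a single behavior can be described, which is part of why the varied behaviors noted above call for trained and evaluated algorithms rather than fixed keyword lists.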