Semi-Automated Identification of Biomedical Literature
Project Final Report (PDF, 2.35 MB)
As the body of academic research literature grows, tools that leverage machine learning, including natural language processing, information retrieval, and text mining, can support and optimize the search and screening steps of systematic reviews, improving comprehensiveness and reducing the manual screening burden.
Project Details
- Status: Completed
- Grant Number: R03 HS027247
- Funding Mechanism(s):
- AHRQ Funded Amount: $99,117
- Principal Investigator(s):
- Organization:
- Location: Providence, Rhode Island
- Project Dates: 09/30/2019 - 09/29/2020
The use of systematic reviews to inform evidence-based medicine and patient-centered comparative effectiveness research is common but time-consuming and resource-intensive. With the explosion of publications, including approximately 100 new trials published daily, the current methodology of developing a search strategy and then manually screening citations for inclusion in the review is inefficient and may miss relevant publications.
For this study, the research team used machine learning technologies, including natural language processing, information retrieval, and text mining methods, to optimize the efficiency of the systematic review process and mitigate the challenges associated with information overload in the literature identification process. The team developed a literature identification process that unifies the query formulation and citation screening steps and uses modern approaches for text encoding (dense text embeddings) to represent the text of the citations in a form that can be used by information-retrieval and machine-learning algorithms.
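As a concrete illustration of the embedding step, the sketch below ranks abstracts against a natural-language question by cosine similarity between dense vectors. This is not the team's implementation: the report does not name an embedding library or model, so the use of sentence-transformers and the model identifier here are assumptions for illustration only.

```python
# Minimal sketch of dense-embedding relevance ranking (assumed library/model,
# not the system described in the report).
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model choice

def rank_citations(question: str, abstracts: list[str], top_k: int = 100) -> list[int]:
    """Return indices of the top_k abstracts most similar to the question."""
    q_vec = model.encode([question], normalize_embeddings=True)   # shape (1, d)
    a_vecs = model.encode(abstracts, normalize_embeddings=True)   # shape (n, d)
    scores = (a_vecs @ q_vec.T).ravel()  # cosine similarity via normalized dot product
    return np.argsort(-scores)[:top_k].tolist()
```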
The specific aims of the study were as follows:
- Develop a system for semi-automating the development of literature searches and the identification of relevant literature in systematic reviews.
- Prospectively and retrospectively cross-evaluate the system by comparing it to the standard approach for two systematic reviews.
The system took as input a set of questions written in natural language and used machine-learning algorithms to rank all of PubMed's citations by relevance to each question, returning the 100 top-ranked citations. These were exported and screened manually, with the screener adjudicating the relevance of each abstract and tagging words that indicated relevance in the relevant abstracts. The system then used these "curated" articles to refine the search and re-rank the remaining abstracts, and a new batch of 100 top-ranked abstracts was exported and screened/tagged. This cycle repeated until convergence (i.e., no further relevant abstracts could be retrieved) or for a set number of iterations (batches), which the research team capped at 10.
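The loop below is a schematic of that screen-and-refine cycle, not the team's actual code: the ranker and the manual-screening step are hypothetical placeholder stubs, while the batch size of 100 and the cap of 10 iterations come from the report.

```python
# Schematic of the iterative screen-and-refine workflow described above.
from dataclasses import dataclass

BATCH_SIZE = 100   # abstracts screened per iteration (from the report)
MAX_BATCHES = 10   # iteration cap set by the research team

@dataclass
class Citation:
    pmid: str
    abstract: str

class DummyRanker:
    """Stand-in for the embedding-based ranker sketched earlier (hypothetical)."""
    def rank(self, questions, citations):
        return citations          # a real ranker would sort by embedding similarity
    def refine(self, curated):
        pass                      # a real ranker would re-weight using the feedback

def screen_manually(batch):
    """Placeholder for human screening: adjudicate relevance and tag words.
    In the real workflow a reviewer does this; here we return dummy labels."""
    return [(c, False, []) for c in batch]

def identify_literature(questions, citations, ranker):
    curated, seen = [], set()
    for _ in range(MAX_BATCHES):
        ranked = ranker.rank(questions, [c for c in citations if c.pmid not in seen])
        batch = ranked[:BATCH_SIZE]
        labels = screen_manually(batch)        # human adjudication + word tagging
        seen.update(c.pmid for c in batch)
        curated.extend(labels)
        if not any(rel for _, rel, _ in labels):
            break                              # convergence: no relevant hits in batch
        ranker.refine(curated)                 # re-rank using the curated feedback
    return curated

if __name__ == "__main__":
    cites = [Citation(pmid=str(i), abstract="") for i in range(250)]
    curated = identify_literature(["question 1"], cites, DummyRanker())
    print(len(curated))  # 100: the dummy labels mark nothing relevant, so one batch
```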
System performance was assessed using seven ongoing or completed systematic reviews (three prospectively and four retrospectively). The system's sensitivity for identifying relevant articles varied across reviews from a low of 0.08 to a high of 0.58. For nearly all reviews, precision was drastically improved compared with the standard procedure of separate searching and manual screening: 0.01 to 0.09 (number needed to read [NNR] 87 to 11) versus 0.006 to 0.083 (NNR 143 to 12) for the standard two-step process. Investigators examined potential factors that might affect sensitivity, but study design, study size, and specific key question did not significantly affect retrieval across the reviews.

Future research should explore ways to encode domain knowledge in query formulation, possibly by incorporating a "reasoning" aspect into the system to elicit more contextual information and by leveraging ontologies and knowledge bases to enrich the questions used in the search.
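For readers unfamiliar with the reported metrics, the snippet below shows the standard definitions, under which NNR is the reciprocal of precision; the report's NNR figures appear to derive from unrounded precisions (e.g., 1/0.09 ≈ 11, the best-case value above).

```python
# Standard retrieval-metric definitions (an assumption, but consistent with
# the NNR figures reported above).
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of all relevant articles that the system retrieved."""
    return tp / (tp + fn)

def precision(tp: int, fp: int) -> float:
    """Fraction of retrieved articles that were relevant."""
    return tp / (tp + fp)

def nnr(p: float) -> float:
    """Number needed to read: abstracts screened per relevant article found."""
    return 1.0 / p

print(round(nnr(0.09)))  # 11, matching the best-case NNR reported above
```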