Enhancing Patient Matching in Support of Operational Health Information Exchange
Project Final Report (PDF, 3.12 MB)
Enhancements in matching patient data across disparate health data sources will expand data integration, improve research data, and give those caring for patients more comprehensive health information, improving the quality and safety of patient care.
Project Details
- Completed
- Grant Number: R01 HS023808
- Funding Mechanism(s)
- AHRQ Funded Amount: $1,701,413
- Principal Investigator(s)
- Organization
- Location: Indianapolis, Indiana
- Project Dates: 07/01/2017 - 04/29/2023
- Technology
- Care Setting
- Population
- Health Care Theme
Patient health data are scattered across numerous databases, one for each place a given patient receives care. Bringing those data together, which is needed for both clinical care and research, is difficult because there is no unique patient identifier, so records must be matched using other strategies. While recommendations on patient matching exist, no consensus has emerged, and there is little peer-reviewed literature to provide an evidence base on the feasibility, effectiveness, and generalizability of implementation. In general, there are three recommended approaches for patient matching: 1) adding more fields to match, which increases the power of discriminating one patient from another; 2) standardizing and adopting uniform data fields for consistency; and 3) assessing and improving matching field accuracy and completeness.
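As an illustration of the second approach (standardizing data fields), the following is a minimal Python sketch. It is not the project's implementation: the function names, the chosen normalization rules, and the list of accepted date formats are all assumptions made for this example.

```python
import re
import unicodedata
from datetime import datetime

# Hypothetical list of date formats seen in source systems (an assumption).
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def standardize_name(name):
    """Return an uppercase, accent-free, punctuation-free form of a name."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Treat hyphens as word separators and delete any other non-letters.
    spaced = no_accents.replace("-", " ")
    letters = re.sub(r"[^A-Za-z ]", "", spaced)
    # Uppercase and collapse repeated whitespace.
    return " ".join(letters.upper().split())


def standardize_date(text):
    """Return the date as YYYY-MM-DD, or None if it cannot be parsed."""
    text = text.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    # Unparseable values are reported as missing rather than guessed.
    return None
```

With these rules, differently formatted copies of the same value (for example, a birth date recorded as `07/04/1980` in one system and `19800704` in another) compare as equal, while values that cannot be interpreted are flagged as missing instead of being treated as disagreements.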
This research implemented and evaluated emerging consensus-based recommendations to match patient data across disparate sources to create evidence to inform a national patient identity management strategy. Patient matching strategies were evaluated in the context of four use cases around newborns, health systems, public health, and deaths. For each use case, the researchers manually reviewed “gold standard” matching datasets from one of the Nation’s most comprehensive health information exchanges, the Indiana Network for Patient Care (INPC). Each of the datasets had been used previously in peer-reviewed patient matching research. INPC’s existing patient matching algorithms served as the research team’s baseline.
The specific aims of this project were as follows:
- Implement three general classes of recommended matching data enhancements and measure the resulting matching accuracy improvements.
- Implement four novel matching algorithm enhancements and assess the resulting matching accuracy improvements.
- Measure the matching accuracy improvements resulting from using combinations of 1) three best practice matching policy recommendations and 2) four novel matching algorithm enhancements.
Linkage methods (describing how data from multiple datasets are connected and the accuracy of such connections) were applied to the data, and enhancements were implemented and assessed. These enhancements were in the areas of handling missing data, considering conditional dependence, incorporating nearness comparison, and applying data standardization. Key findings included the value of token frequency (i.e., how often a specific linguistic unit appears in text) in matching records, the importance of accounting for conditional dependence (i.e., accounting for the underlying correlation between agreeing fields), and the benefits of data standardization and similarity measures. Managing missing data using the missing-at-random (MAR) method (i.e., a model that assumes that the probability of missing data depends only on other observed data) significantly improved match accuracy.
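To make two of these ideas concrete, the sketch below shows a simple probabilistic match score in Python in the style of the classic Fellegi-Sunter model. It is an illustration only, not the algorithm used in this project: the field names, the m and u probabilities, and the function names are hypothetical. A missing value contributes zero weight, which is one simple way to act on a missing-at-random style assumption, and a value-specific weight shows why agreement on a rare token is stronger evidence than agreement on a common one. The score sums per-field weights, which assumes the fields are conditionally independent; that is precisely the simplification the conditional dependence enhancement is meant to relax.

```python
import math

# Hypothetical parameters per field (illustrative values only):
# m = P(field agrees | same patient), u = P(field agrees | different patients).
FIELD_PARAMS = {
    "last_name": (0.95, 0.01),
    "birth_date": (0.97, 0.005),
    "zip_code": (0.90, 0.05),
}


def field_weight(a, b, m, u):
    """Log2 likelihood-ratio weight for comparing one field."""
    if a is None or b is None:
        # A missing value is treated as uninformative, so it adds nothing.
        return 0.0
    if a == b:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def match_score(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum of field weights; higher scores suggest the same patient."""
    return sum(
        field_weight(rec_a.get(field), rec_b.get(field), m, u)
        for field, (m, u) in params.items()
    )


def agreement_weight_by_frequency(value, value_counts, m):
    """Agreement weight where u is the relative frequency of the value."""
    total = sum(value_counts.values())
    u_value = value_counts[value] / total
    return math.log2(m / u_value)
```

In this sketch, two records that agree on every field receive a positive score, a record pair with a missing ZIP code is scored on the remaining fields only, and `agreement_weight_by_frequency` returns a larger weight for a surname that appears in few records than for one that appears in many.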
This research showed that enhancements to current patient matching strategies improve the accuracy of integrating patient data and identified opportunities for improving linkage matching methods and incorporating methods for accommodating missingness. The researchers plan to expand upon this work in the future by looking at the performance of matching among racial, ethnic, and other demographic groups, including conducting research into algorithmic bias.