The Utility of Imputed Matched Sets

OBJECTIVE To compare results from high probability matched sets with those from imputed matched sets across differing levels of linkage information.

METHODS A series of linkages with varying amounts of available information was performed on two simulated datasets derived from multiyear motor vehicle crash (MVC) and hospital databases, in which the true matches were known. The distributions of occupant age, MVC county, and MVC hour in the high probability and imputed matched sets were compared against those in the true match population. Regression models were fit to simulated log hospital charges and hospitalization status.

RESULTS In high information settings, neither high probability nor imputed matched sets differed significantly from the true match population in occupant age, MVC county, or MVC hour (p > 0.999). In low information settings, high probability matched sets differed significantly from the true match population in occupant age and MVC county (p < 0.002), but imputed matched sets did not (p > 0.493). In high information settings, inference on simulated log hospital charges and hospitalization status did not differ significantly between the two methods. In low information settings, both high probability and imputed matched sets differed significantly from the outcomes; however, imputed matched sets were more robust.

CONCLUSIONS The level of information available to a linkage is an important consideration. High probability matched sets are suitable for high to moderate information settings and for situations involving case-specific analysis. Conversely, imputed matched sets are preferable for low information settings when conducting population-based analyses.
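The difference between the two kinds of matched set can be illustrated with a minimal Python sketch. It is not the authors' implementation: the candidate pairs, their match probabilities, the 0.90 threshold, and the function names `high_probability_set` and `imputed_sets` are all hypothetical, and the sketch assumes that each crash record's candidate probabilities sum to at most 1, with the remainder treated as "no match".

```python
import random

# Hypothetical candidate pairs: (crash_id, hospital_id, match_probability).
# The third and fourth rows are two competing candidates for crash "c3".
candidates = [
    ("c1", "h1", 0.98),
    ("c2", "h2", 0.95),
    ("c3", "h3", 0.40),
    ("c3", "h4", 0.35),
    ("c4", "h5", 0.10),
]


def high_probability_set(pairs, threshold=0.90):
    """Keep only the pairs whose match probability meets the threshold."""
    return [(c, h) for c, h, p in pairs if p >= threshold]


def imputed_sets(pairs, n_sets=5, seed=0):
    """Draw n_sets matched sets by sampling in proportion to probability.

    Each crash record receives at most one hospital record per draw.
    Any probability mass left over after the candidates means 'no match'.
    """
    rng = random.Random(seed)

    # Group the candidate hospital records by crash record.
    by_crash = {}
    for c, h, p in pairs:
        by_crash.setdefault(c, []).append((h, p))

    sets = []
    for _ in range(n_sets):
        drawn = []
        for c, options in by_crash.items():
            u = rng.random()
            cumulative = 0.0
            for h, p in options:
                cumulative += p
                if u < cumulative:
                    drawn.append((c, h))
                    break  # at most one match per crash record
        sets.append(drawn)
    return sets


if __name__ == "__main__":
    print("High probability set:", high_probability_set(candidates))
    for i, matched in enumerate(imputed_sets(candidates), start=1):
        print(f"Imputed set {i}:", matched)
```

In this toy input the high probability set never contains crash `c3` or `c4`, whereas the imputed sets include them in some draws, roughly in proportion to their match probabilities. This is the sense in which imputed sets retain information from uncertain links in a population-level analysis.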
