Linking Data for Mothers and Babies in De-Identified Electronic Health Data

Objective: Linkage of longitudinal administrative data for mothers and babies supports research and service evaluation in several populations around the world. We established a linked mother-baby cohort using pseudonymised, population-level data for England.

Design and Setting: Retrospective linkage study using electronic hospital records of mothers and babies admitted to NHS hospitals in England, captured in Hospital Episode Statistics between April 2001 and March 2013.

Results: Of 672,955 baby records in 2012/13, 280,470 (42%) linked deterministically to a maternal record using hospital, GP practice, maternal age, birthweight, gestation, birth order and sex. A further 380,164 (56%) records linked using probabilistic methods incorporating additional variables that could differ between mother and baby records (admission dates, ethnicity, 3/4-character postcode district) or that include missing values (delivery variables). The false-match rate was estimated at 0.15% using synthetic data. Data quality improved over time: for 2001/02, 91% of baby records were linked (holding the estimated false-match rate at 0.15%). The linked cohort was representative of national distributions of gender, gestation, birth weight and maternal age, and captured approximately 97% of births in England.

Conclusion: Probabilistic linkage of maternal and baby healthcare characteristics offers an efficient way to enrich maternity data, improve data quality, and create longitudinal cohorts for research and service evaluation. This approach could be extended to linkage of other datasets that have non-disclosive characteristics in common.
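The two-stage approach described in the Results can be illustrated with a minimal Python sketch. It is not the study's implementation: it assumes a standard Fellegi-Sunter style match weight for the probabilistic step, and the field names, the m- and u-probabilities, and the example records are all hypothetical values chosen for illustration. The deterministic step requires exact agreement on the seven variables listed in the abstract; the probabilistic step sums log-likelihood ratios over a wider set of variables and skips any that are missing.

```python
import math

# Variables that must agree exactly in the deterministic step
# (the list comes from the abstract; the field names are hypothetical).
DETERMINISTIC_KEYS = (
    "hospital", "gp_practice", "maternal_age", "birthweight",
    "gestation", "birth_order", "sex",
)

# Hypothetical (m, u) probabilities per field:
#   m = P(fields agree | records are a true mother-baby pair)
#   u = P(fields agree | records are not a true pair)
# These numbers are illustrative assumptions, not estimates from the study.
M_U = {
    "hospital": (0.99, 0.01),
    "gp_practice": (0.95, 0.001),
    "maternal_age": (0.98, 0.04),
    "birthweight": (0.95, 0.002),
    "gestation": (0.95, 0.10),
    "birth_order": (0.99, 0.90),
    "sex": (0.99, 0.50),
    "admission_date": (0.90, 0.003),
    "ethnicity": (0.85, 0.30),
    "postcode_district": (0.90, 0.001),
}


def deterministic_match(baby, mother):
    """True only if every deterministic key is present and identical."""
    return all(
        baby.get(key) is not None and baby.get(key) == mother.get(key)
        for key in DETERMINISTIC_KEYS
    )


def match_weight(baby, mother):
    """Sum of log2 likelihood ratios over fields present on both records."""
    total = 0.0
    for field, (m, u) in M_U.items():
        b, mo = baby.get(field), mother.get(field)
        if b is None or mo is None:
            continue  # a missing value contributes no evidence either way
        if b == mo:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


if __name__ == "__main__":
    # Made-up example records.
    baby = {
        "hospital": "H01", "gp_practice": "G123", "maternal_age": 29,
        "birthweight": 3410, "gestation": 39, "birth_order": 1, "sex": "F",
        "admission_date": "2012-06-01", "ethnicity": "A",
        "postcode_district": "NW1",
    }
    # Same delivery, but gestation is missing and ethnicity is coded differently.
    mother = dict(baby, gestation=None, ethnicity="B")

    print(deterministic_match(baby, mother))  # fails on the missing gestation
    print(round(match_weight(baby, mother), 2))
```

In practice a pair would be accepted as a link only if its weight exceeded a threshold chosen to hold the estimated false-match rate at an acceptable level; how that threshold is set is outside this sketch.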
