Estimating Causal Effects in Observational Studies using Electronic Health Data: Challenges and (Some) Solutions

Electronic health data sets, including electronic health records (EHRs) and administrative databases, are rich data sources with the potential to help answer important questions about the effects of clinical interventions and policy changes. However, analyses using such data are almost always non-experimental, raising the concern that those who receive a particular intervention differ from those who do not in ways that confound the effects of interest. This paper outlines the challenges in estimating causal effects using electronic health data and offers some solutions, with particular attention to propensity score methods, which help ensure that comparisons are made between similar groups. The methods are illustrated with a case study describing the design of a study that uses Medicare and Medicaid administrative data to estimate the effect of the Medicare Part D prescription drug program on individuals with serious mental illness.
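
To make the idea of comparing similar groups concrete, the following is a minimal sketch of one propensity score method, inverse probability weighting, on a toy data set. Everything in it is an assumption made for illustration: the records, the single binary covariate `severe`, the function names, and the choice of weighting rather than matching. It is not the design or the data of the Medicare Part D case study.

```python
from collections import defaultdict

# Hypothetical toy records: (severe, treated, outcome). All values are invented
# for illustration. Sicker patients (severe=1) are treated more often and have
# lower outcomes, so severity confounds the raw comparison. The outcomes are
# constructed so that treatment adds 2.0 within each severity level.
records = (
    [(0, 1, 12.0)] * 2 + [(0, 0, 10.0)] * 6 +
    [(1, 1, 6.0)] * 6 + [(1, 0, 4.0)] * 2
)


def naive_difference(rows):
    """Unadjusted difference in mean outcome, treated minus control."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def estimate_propensity(rows):
    """Estimate P(treated | severe) as the treated fraction in each stratum."""
    counts = defaultdict(lambda: [0, 0])  # covariate value -> [n_treated, n_total]
    for x, t, _ in rows:
        counts[x][0] += t
        counts[x][1] += 1
    return {x: n_treated / n_total for x, (n_treated, n_total) in counts.items()}


def ipw_difference(rows):
    """Weighted difference in means using inverse probability weights.

    Treated units get weight 1/e(x) and controls 1/(1 - e(x)). This fails with a
    division by zero if some stratum is all treated or all control, that is,
    when there is no overlap between the groups.
    """
    e = estimate_propensity(rows)
    sums = {1: [0.0, 0.0], 0: [0.0, 0.0]}  # arm -> [weighted outcome sum, weight sum]
    for x, t, y in rows:
        w = 1.0 / e[x] if t == 1 else 1.0 / (1.0 - e[x])
        sums[t][0] += w * y
        sums[t][1] += w
    return sums[1][0] / sums[1][1] - sums[0][0] / sums[0][1]


if __name__ == "__main__":
    print("naive difference:", naive_difference(records))
    print("weighted difference:", ipw_difference(records))
```

In this toy data the unadjusted comparison gives -1.0, suggesting harm, while the weighted comparison recovers a value of about 2.0, the effect built into the data. The weighting only corrects for the measured covariate; it does nothing about confounders that are not recorded.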
