Using Big Data to Emulate a Target Trial When a Randomized Trial Is Not Available.

Ideally, questions about comparative effectiveness or safety would be answered using an appropriately designed and conducted randomized experiment. When we cannot conduct a randomized experiment, we analyze observational data. Causal inference from large observational databases (big data) can be viewed as an attempt to emulate a randomized experiment (the target experiment or target trial) that would answer the question of interest. When the goal is to guide decisions among several strategies, causal analyses of observational data need to be evaluated with respect to how well they emulate a particular target trial. We outline a framework for comparative effectiveness research using big data that makes the target trial explicit. This framework channels counterfactual theory for comparing the effects of sustained treatment strategies, organizes analytic approaches, provides a structured process for the criticism of observational studies, and helps avoid common methodologic pitfalls.
