A systematic review of causal methods enabling predictions under hypothetical interventions

Background: The methods typically used to develop prediction models mean that neither their parameters nor their predictions should be interpreted causally. For many applications this is perfectly acceptable. However, when prediction models are used to support decision making, there is often a need to predict outcomes under hypothetical interventions.

Aims: We aimed to identify and compare published methods that use causal inference to develop and validate prediction models enabling risk estimation of outcomes under hypothetical interventions. Specifically, we aimed to identify the main methodological approaches, their underlying assumptions, targeted estimands, and possible sources of bias, and to highlight unresolved methodological challenges.

Methods: We systematically reviewed literature published by December 2019, considering papers in the health domain that used causal considerations to enable prediction models to estimate outcomes under hypothetical interventions. We included both methodology development studies and applied studies.

Results: We identified 4919 papers through database searches and a further 115 papers through manual searches. Of these, 87 papers were retained for full-text screening, of which 12 were selected for inclusion. We found papers from both the statistical and the machine learning literature. Most of the identified methods for causal inference from observational data were based on marginal structural models and g-estimation.
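To make the distinction between an ordinary prediction and a prediction under a hypothetical intervention concrete, the following is a minimal sketch and not a method taken from any of the reviewed papers. It uses a simulated point-treatment setting with one binary confounder and estimates the risk under "treat everyone" by inverse probability weighting, which is the simplest analogue of a marginal structural model. The data-generating process, the variable names (`L`, `A`, `Y`) and all numeric values are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data-generating process (an assumption for illustration only):
# L is a binary confounder, A is a binary treatment, Y is a binary outcome.
L = rng.binomial(1, 0.5, n)
A = rng.binomial(1, np.where(L == 1, 0.8, 0.2))  # high-risk patients are treated more often
p_y = 0.1 + 0.4 * L - 0.05 * A                   # treatment lowers risk by 0.05 in everyone
Y = rng.binomial(1, p_y)

# Ordinary (non-causal) prediction: observed risk among those who happened to be treated.
naive_risk_treated = Y[A == 1].mean()

# Propensity score P(A = 1 | L), estimated within each stratum of L.
ps = np.where(L == 1, A[L == 1].mean(), A[L == 0].mean())

# Inverse probability weighted risk under the hypothetical intervention "treat everyone".
w = 1.0 / ps[A == 1]
ipw_risk_treated = np.sum(w * Y[A == 1]) / np.sum(w)

# True risk under the intervention, known here only because the data are simulated.
true_risk_treated = 0.1 + 0.4 * 0.5 - 0.05

print(f"naive risk among treated:     {naive_risk_treated:.3f}")
print(f"IPW risk under 'treat all':   {ipw_risk_treated:.3f}")
print(f"true risk under 'treat all':  {true_risk_treated:.3f}")
```

In this simulation the naive estimate is inflated because treated patients are disproportionately high risk, whereas the weighted estimate targets the risk that would be seen if the whole population were treated. The sketch relies on the usual assumptions of no unmeasured confounding, positivity and consistency, and it ignores the time-varying treatments that many of the reviewed methods are designed for.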
