On the aggregation of published prognostic scores for causal inference in observational studies

Because real-world evidence on drug efficacy involves nonrandomized studies, statistical methods that adjust for confounding are needed. In this context, prognostic score (PGS) analysis has recently been proposed as a method for causal inference. It aims to restore balance across treatment groups by identifying subjects with a similar prognosis under a given reference exposure (“control”). This requires developing a multivariable prognostic model in the control arm of the study sample, which is then extrapolated to the other treatment arms. Unfortunately, large cohorts for developing prognostic models are not always available. Prognostic models therefore face a dilemma between overfitting and parsimony, and a parsimonious model that ignores important covariates risks violating the assumption of no unmeasured confounders. Although overfitting can be limited by penalization strategies, an alternative approach is evidence synthesis. Aggregating previously published prognostic models may improve the generalizability of PGS while accounting for a large set of covariates, even when limited individual participant data are available. In this article, we extend a method for prediction model aggregation to PGS analysis in nonrandomized studies. We conduct extensive simulations to assess the validity of model aggregation, compared with other methods of PGS analysis for estimating marginal treatment effects. We show that aggregating existing PGS into a “meta‐score” is robust to misspecification, even when elementary scores wrongly omit confounders or focus on different outcomes. We illustrate our methods in a setting of treatments for asthma.
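As a rough illustration of the workflow summarized above, the following minimal sketch simulates a nonrandomized study, combines two published scores into a meta-score using the control arm only, extrapolates it to all subjects, and then stratifies on it to estimate a marginal risk difference. Everything in it is an assumption made for illustration: the simulated data, the two hypothetical published scores (`score1`, `score2`), the use of a stacked logistic regression as the aggregation step, and quintile stratification as the adjustment step. It is not necessarily the aggregation method or estimator used in the article.

```python
import numpy as np

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        w = p * (1.0 - p)
        grad = X.T @ (y - p)
        hess = X.T @ (X * w[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated nonrandomized study (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=(n, 3))                           # three confounders
a = rng.binomial(1, expit(0.5 * x[:, 0] - 0.5 * x[:, 1]))  # confounded treatment
lin = -1.0 + 0.8 * x[:, 0] + 0.6 * x[:, 1] + 0.4 * x[:, 2] - 0.5 * a
y = rng.binomial(1, expit(lin))                       # binary outcome

# Two hypothetical "published" scores, each omitting one confounder.
score1 = 0.7 * x[:, 0] + 0.5 * x[:, 1]
score2 = 0.5 * x[:, 1] + 0.5 * x[:, 2]

# Aggregation step (assumed here: stacked regression): regress the outcome
# on the published scores using the control arm only.
ctrl = a == 0
Z = np.column_stack([np.ones(n), score1, score2])
gamma = fit_logistic(Z[ctrl], y[ctrl])

# Extrapolate the meta-score to every subject, treated or not.
meta_score = Z @ gamma

# Adjustment step (assumed here: stratification on meta-score quintiles),
# weighting stratum-specific risk differences by stratum size.
edges = np.quantile(meta_score, [0.2, 0.4, 0.6, 0.8])
stratum = np.digitize(meta_score, edges)
rd = 0.0
for s in range(5):
    m = stratum == s
    rd += m.mean() * (y[m & (a == 1)].mean() - y[m & ctrl].mean())

print("stacking weights:", gamma)
print("stratified marginal risk difference:", rd)
```

Fitting the weights in the control arm alone mirrors the idea that a prognostic score describes prognosis under the reference exposure. Matching or regression adjustment on `meta_score` would be alternative ways to use it.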
