Variable Selection in Causal Inference using a Simultaneous Penalization Method

Abstract: In the causal adjustment setting, variable selection techniques based only on the outcome model or only on the treatment allocation model can omit confounders, which may bias the estimated treatment effect, or include spurious variables, which inflates its variance. We propose a variable selection method using a penalized objective function that is based on both the outcome and treatment assignment models. The proposed method facilitates confounder selection in high-dimensional settings. We show that, under some mild conditions, our method attains the oracle property. The selected variables are then used to form a doubly robust regression estimator of the treatment effect. Using the proposed method, we analyze economic growth data to study the effect of life expectancy, as a measure of population health, on the average growth rate of gross domestic product per capita.
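
To make the final estimation step concrete, the following is a minimal sketch of a generic doubly robust (augmented inverse probability weighted, AIPW) estimator for a binary treatment. It is an illustration under stated assumptions, not the estimator or the penalized selection step from the paper: the abstract does not give the exact form of either, so the function names (`fit_logistic`, `aipw_ate`, `simulate`), the simulated data, and the true effect of 2.0 are all hypothetical. The sketch assumes the covariates passed in have already been selected.

```python
import numpy as np


def fit_logistic(X, a, n_iter=25):
    """Fit a logistic regression by Newton-Raphson (X includes an intercept column)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        grad = X.T @ (a - p)                 # score of the log-likelihood
        hess = X.T @ (X * w[:, None])        # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def aipw_ate(y, a, X):
    """Generic AIPW estimate of the average treatment effect (binary treatment a)."""
    Xd = np.column_stack([np.ones(len(y)), X])
    # Treatment model: estimated propensity scores.
    ps = 1.0 / (1.0 + np.exp(-(Xd @ fit_logistic(Xd, a))))
    # Outcome model: separate least-squares fits in each treatment arm.
    b1 = np.linalg.lstsq(Xd[a == 1], y[a == 1], rcond=None)[0]
    b0 = np.linalg.lstsq(Xd[a == 0], y[a == 0], rcond=None)[0]
    m1, m0 = Xd @ b1, Xd @ b0
    # Outcome-model prediction plus inverse-probability-weighted residual correction.
    psi = m1 - m0 + a * (y - m1) / ps - (1 - a) * (y - m0) / (1.0 - ps)
    return psi.mean()


def simulate(n=5000, seed=0):
    """Hypothetical data: two confounders, true treatment effect 2.0."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 2))
    ps = 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] + 0.5 * X[:, 1])))
    a = rng.binomial(1, ps)
    y = 2.0 * a + X[:, 0] + X[:, 1] + rng.normal(size=n)
    return y, a, X


if __name__ == "__main__":
    y, a, X = simulate()
    naive = y[a == 1].mean() - y[a == 0].mean()
    print("naive difference in means:", naive)
    print("AIPW estimate:", aipw_ate(y, a, X))
```

In this simulated setting both covariates are confounders, so the unadjusted difference in means is biased away from 2.0, while the AIPW estimate combines the outcome and treatment models and remains consistent if either one is correctly specified.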
