Model-assisted inference for treatment effects using regularized calibrated estimation with high-dimensional data

Consider the problem of estimating average treatment effects when a large number of covariates are used to adjust for possible confounding through outcome regression and propensity score models. The conventional approach of iterative model building and fitting can be difficult to implement and depends on ad hoc choices of which variables are included. In addition, the uncertainty from the iterative process of model selection is complicated and often ignored in subsequent inference about treatment effects. We develop new methods and theory to obtain not only doubly robust point estimators for average treatment effects, which remain consistent if either the propensity score model or the outcome regression model is correctly specified, but also model-assisted confidence intervals, which are valid when the propensity score model is correctly specified but the outcome regression model may be misspecified. With a linear outcome model, the confidence intervals are doubly robust; that is, they are also valid when the outcome model is correctly specified but the propensity score model may be misspecified. Our methods involve regularized calibrated estimators with Lasso penalties, but with carefully chosen loss functions, for fitting propensity score and outcome regression models. We provide a high-dimensional analysis to establish the desired properties of our methods under conditions comparable to those of previous results, which give valid confidence intervals when both the propensity score and outcome regression models are correctly specified. We present a simulation study and an empirical application, which confirm the advantages of the proposed methods compared with related methods based on regularized maximum likelihood estimation.
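
To make the double robustness property concrete, the following is a minimal sketch of the standard augmented inverse probability weighted (AIPW) form of a doubly robust estimator for the mean of the potential outcome under treatment. It is not the regularized calibrated estimation procedure itself: the fitted propensity scores `ps` and fitted outcome predictions `m1` are assumed to be supplied by some earlier fitting step, and the function name `aipw_mean_treated` and its arguments are hypothetical names chosen for illustration. The simple standard error shown is the usual influence-function-based one and ignores any effect of estimating the nuisance models.

```python
import numpy as np

def aipw_mean_treated(y, t, ps, m1):
    """Sketch of an AIPW estimate of E[Y(1)] (hypothetical helper).

    y  : observed outcomes (values for untreated units are ignored)
    t  : treatment indicators, 1 = treated, 0 = untreated
    ps : fitted propensity scores P(T = 1 | X), assumed to lie in (0, 1)
    m1 : fitted outcome regression values E[Y | T = 1, X]
    """
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    ps = np.asarray(ps, dtype=float)
    m1 = np.asarray(m1, dtype=float)

    # Zero out outcomes of untreated units so missing values cannot leak in.
    y_treated = np.where(t == 1, y, 0.0)

    # Inverse-probability-weighted term minus the augmentation term.
    phi = t * y_treated / ps - (t / ps - 1.0) * m1

    estimate = phi.mean()
    # Plug-in standard error that treats ps and m1 as fixed (an assumption).
    std_error = phi.std(ddof=1) / np.sqrt(phi.size)
    return estimate, std_error
```

With `m1` set to zero the estimate reduces to plain inverse probability weighting, and with `ps` set to a constant the augmentation term corrects an outcome-regression average, which is the intuition behind consistency when either model is correct.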
