Doubly robust semiparametric inference using regularized calibrated estimation with high-dimensional data

Consider semiparametric estimation where a doubly robust estimating function for a low-dimensional parameter is available, depending on two working models. With high-dimensional data, we develop regularized calibrated estimation as a general method for estimating the parameters in the two working models, such that valid Wald confidence intervals can be obtained for the parameter of interest under suitable sparsity conditions if either of the two working models is correctly specified. We propose a computationally tractable two-step algorithm and provide a rigorous theoretical analysis that justifies sufficiently fast rates of convergence for the regularized calibrated estimators, despite their sequential construction, and establishes the desired asymptotic expansion for the doubly robust estimator. As concrete examples, we discuss applications to partially linear, log-linear, and logistic models and to estimation of average treatment effects. Numerical studies in the first three examples demonstrate superior performance of our method compared with debiased Lasso.
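
As an illustration of what a doubly robust estimating function looks like, consider the partially linear model, one of the examples named above. The notation here ($Y$, $Z$, $X$, $\theta$, $g$, $m$, $\alpha$, $\gamma$, $\tau$) is assumed for this sketch and is not taken from the paper. Suppose $E(Y \mid Z, X) = \theta Z + g(X)$, with working models $g(X; \alpha)$ for $g(X)$ and $m(X; \gamma)$ for $E(Z \mid X)$. A standard doubly robust estimating function for $\theta$ is

\[
\tau(O; \theta, \alpha, \gamma) = \{ Y - \theta Z - g(X; \alpha) \} \{ Z - m(X; \gamma) \}.
\]

At the true $\theta$, this function has mean zero if either working model is correct. If $g(X; \alpha) = g(X)$, then

\[
E\{ Y - \theta Z - g(X; \alpha) \mid Z, X \} = 0,
\]

so the product has mean zero whatever $m(X; \gamma)$ is. If instead $m(X; \gamma) = E(Z \mid X)$, then

\[
E[ \{ g(X) - g(X; \alpha) \} \{ Z - m(X; \gamma) \} ] = E[ \{ g(X) - g(X; \alpha) \} \, E\{ Z - m(X; \gamma) \mid X \} ] = 0,
\]

so the error in the model for $g$ does not bias the estimating function.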
