Targeted Undersmoothing

This paper proposes a post-model-selection inference procedure, called targeted undersmoothing, designed to construct uniformly valid confidence sets for functionals of sparse high-dimensional models, including dense functionals that may depend on many or all elements of the high-dimensional parameter vector. The confidence sets are based on an initially selected model and two additional models that enlarge the initial model. By varying the enlargements of the initial model, one can also conduct a sensitivity analysis of how robust empirical conclusions are to model selection mistakes in the initial model. We apply the procedure in two empirical examples: estimating heterogeneous treatment effects in a job training program and estimating profitability from an estimated mailing strategy in a marketing campaign. We also illustrate the procedure’s performance through simulation experiments.
