High-dimensional instrumental variables regression and confidence sets -- v2/2012

We propose an instrumental variables method for inference in high-dimensional structural equations with endogenous regressors. The number of regressors K can be much larger than the sample size. A key ingredient is sparsity, i.e., the vector of coefficients has many zeros, or approximate sparsity, i.e., it is well approximated by a vector with many zeros. We can have fewer instruments than regressors and allow for partial identification. Our procedure, called the STIV (Self Tuning Instrumental Variables) estimator, is realized as a solution of a conic program. The joint confidence sets can be obtained by solving K convex programs. We provide rates of convergence and model selection results, and propose three types of joint confidence sets, each relying on different assumptions on the parameter space. Under the stronger assumption, they are adaptive. The results are uniform over wide classes of distributions of the data and can have finite sample validity. When the number of instruments is too large, or when the only instruments available for an endogenous regressor are too weak, the confidence sets can have infinite volume with positive probability. This provides a simple one-stage procedure for inference robust to weak instruments, which could also be used for low-dimensional models. In our IV regression setting, the standard tools from the literature on sparsity, such as the restricted eigenvalue assumption, are inapplicable. Therefore, we develop new, sharper sensitivity characteristics, as well as easy-to-compute data-driven bounds. All results apply to the particular case of the usual high-dimensional regression. We also present extensions to the high-dimensional framework of the two-stage least squares method and a method to detect endogenous instruments given a set of exogenous instruments.
