Sparse Models and Methods for Optimal Instruments with an Application to Eminent Domain

We develop results for the use of Lasso and Post-Lasso methods to form first-stage predictions and estimate optimal instruments in linear instrumental variables (IV) models with many instruments, $p$. Our results apply even when $p$ is much larger than the sample size, $n$. We show that the IV estimator based on using Lasso or Post-Lasso in the first stage is root-$n$ consistent and asymptotically normal when the first stage is approximately sparse, i.e., when the conditional expectation of the endogenous variables given the instruments can be well approximated by a relatively small set of variables whose identities may be unknown. We also show that the estimator is semi-parametrically efficient when the structural error is homoscedastic. Notably, our results allow for imperfect model selection and do not rely upon the unrealistic "beta-min" conditions that are widely used to establish validity of inference following model selection. In simulation experiments, the Lasso-based IV estimator with a data-driven penalty performs well compared to recently advocated many-instrument-robust procedures. In an empirical example dealing with the effect of judicial eminent domain decisions on economic outcomes, the Lasso-based IV estimator outperforms an intuitive benchmark. In developing the IV results, we establish a series of new results for Lasso and Post-Lasso estimators of nonparametric conditional expectation functions, which are of independent theoretical and practical interest. We construct a modification of Lasso designed to deal with non-Gaussian, heteroscedastic disturbances, which uses a data-weighted $\ell_1$-penalty function. Using moderate deviation theory for self-normalized sums, we provide convergence rates for the resulting Lasso and Post-Lasso estimators that are as sharp as the corresponding rates in the homoscedastic Gaussian case under the condition that $\log p = o(n^{1/3})$.
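
The two-step idea described above (select instruments with Lasso, refit by least squares on the selected set, then use the fitted values as the instrument) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the simulated design, the function names, the plain coordinate-descent solver, and in particular the hand-picked penalty `lam=0.25`, which stands in for the data-driven, data-weighted penalty the abstract refers to. All variables are simulated with mean zero, so no intercept is included.

```python
import random


def soft_threshold(x, t):
    """Soft-thresholding operator used in the Lasso coordinate update."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def lasso_cd(cols, y, lam, n_iter=100):
    """Minimise (1/(2n)) * ||y - Z b||^2 + lam * ||b||_1 by coordinate descent.

    cols is a list of instrument columns, each a list of length n.
    With lam = 0 the same iteration converges to least squares.
    """
    n = len(y)
    b = [0.0] * len(cols)
    resid = list(y)  # residual y - Z b, starting from b = 0
    norms = [sum(v * v for v in c) / n for c in cols]
    for _ in range(n_iter):
        for j, c in enumerate(cols):
            # (1/n) * z_j' (partial residual that excludes column j)
            rho = sum(ci * ri for ci, ri in zip(c, resid)) / n + norms[j] * b[j]
            new_bj = soft_threshold(rho, lam) / norms[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                resid = [ri - delta * ci for ri, ci in zip(resid, c)]
                b[j] = new_bj
    return b


def simulate(n=300, p=30, alpha=1.0, seed=0):
    """Hypothetical design: only the first 3 of p instruments matter."""
    rng = random.Random(seed)
    cols = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
    d, y = [], []
    for i in range(n):
        v = rng.gauss(0, 1)                    # first-stage error
        eps = 0.8 * v + 0.6 * rng.gauss(0, 1)  # structural error, correlated with v
        d_i = cols[0][i] + cols[1][i] + cols[2][i] + v
        d.append(d_i)
        y.append(alpha * d_i + eps)
    return cols, d, y


def lasso_iv(cols, d, y, lam=0.25):
    """Lasso selection, Post-Lasso refit, then IV with the fitted instrument.

    lam is a hand-picked illustrative value (an assumption of this sketch).
    Assumes at least one instrument is selected.
    """
    b = lasso_cd(cols, d, lam)
    selected = [j for j, bj in enumerate(b) if bj != 0.0]
    sel_cols = [cols[j] for j in selected]
    # Post-Lasso: unpenalised refit of d on the selected instruments only
    b_post = lasso_cd(sel_cols, d, lam=0.0)
    d_hat = [sum(bj * c[i] for bj, c in zip(b_post, sel_cols))
             for i in range(len(d))]
    # IV estimator using d_hat as the instrument for d
    num = sum(h * yi for h, yi in zip(d_hat, y))
    den = sum(h * di for h, di in zip(d_hat, d))
    return num / den, selected


cols, d, y = simulate()
alpha_hat, selected = lasso_iv(cols, d, y)
print("selected instruments:", selected)
print("Post-Lasso IV estimate of alpha:", alpha_hat)
```

In this simulated design the true coefficient is `alpha = 1.0` and the structural error is correlated with the first-stage error, so a regression of `y` on `d` without instruments would be biased. The sketch makes no attempt to reproduce the paper's penalty choice, its treatment of heteroscedasticity, or its inference results.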
