Machine Learning for Partial Identification: Example of Bracketed Data

Partially identified models occur commonly in economic applications. A common problem in this literature is regression with a bracketed (interval-censored) outcome variable Y, which makes the parameter of interest set-identified rather than point-identified. Recent studies have considered only finite-dimensional linear regression in this context. To incorporate more complex controls, we consider a partially linear projection of Y on the set of functions that are linear in the treatment/policy variables and nonlinear in the controls. We characterize the identified set for the linear component of this projection and propose an estimator of its support function. Our estimator converges at the parametric rate and is asymptotically normal. It may be useful for labor economics applications that involve bracketed salaries and rich, high-dimensional demographic data about the subjects of the study.
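To make the notion of a support function concrete, here is a minimal numerical sketch. It is not the estimator proposed in this paper. It is a naive plug-in that assumes the standard characterization for linear projections with an interval-valued outcome carries over once the controls are partialled out: with V the treatment residual D - E[D|X], Sigma = E[VV'], and z_q = q' Sigma^{-1} V, the support function in direction q is E[z_q (Y_U 1{z_q > 0} + Y_L 1{z_q <= 0})]. The function name `support_function`, the simulated design, and the polynomial first stage (a stand-in for a machine-learning fit) are all hypothetical choices made for illustration.

```python
import numpy as np

def support_function(q, V, y_lo, y_hi):
    """Naive plug-in support function in direction q.

    V is an (n, d) array of treatment residuals D - E[D|X];
    y_lo and y_hi are the observed bracket endpoints of the outcome.
    """
    n = V.shape[0]
    Sigma = V.T @ V / n                      # sample analogue of E[V V']
    z = V @ np.linalg.solve(Sigma, q)        # z_i = q' Sigma^{-1} V_i
    # Pick the bracket endpoint that maximizes z_i * y for each observation.
    y_star = np.where(z > 0, y_hi, y_lo)
    return float(np.mean(z * y_star))

# Hypothetical simulated data: scalar treatment, one control, unit-width brackets.
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=n)
D = np.sin(X) + rng.normal(size=n)
Y = 1.0 * D + X**2 + rng.normal(size=n)      # latent outcome, unobserved in practice
y_lo = np.floor(Y)                           # observed lower bracket endpoint
y_hi = y_lo + 1.0                            # observed upper bracket endpoint

# First stage: a polynomial fit stands in for an ML estimate of E[D|X].
coef = np.polyfit(X, D, 5)
V = (D - np.polyval(coef, X)).reshape(-1, 1)

q = np.array([1.0])
upper = support_function(q, V, y_lo, y_hi)       # upper bound on the coefficient
lower = -support_function(-q, V, y_lo, y_hi)     # lower bound on the coefficient

# Infeasible benchmark using the latent Y, for comparison only.
Sigma = V.T @ V / n
beta_latent = float(np.mean((V @ np.linalg.solve(Sigma, q)) * Y))
print(lower, beta_latent, upper)
```

For a scalar treatment the two directions q = 1 and q = -1 trace out the whole interval; with several treatments, evaluating the support function over a grid of directions on the unit sphere would outline the convex identified set. A plug-in of this kind does not correct for first-stage estimation error, so it should not be expected to inherit the parametric-rate and asymptotic-normality guarantees stated above.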
