Orthogonal Machine Learning for Demand Estimation: High Dimensional Causal Inference in Dynamic Panels

There has been growing interest in how economists can import machine learning tools designed for prediction to accelerate and automate model selection while still retaining desirable inference properties for causal parameters. Focusing on partially linear models, we extend the Double ML framework to allow for (1) a number of treatments that may grow with the sample size and (2) the analysis of panel data under sequentially exogenous errors. Our low-dimensional treatment (LD) regime directly extends the work in [Chernozhukov et al., 2016] by showing that the coefficients from a second-stage ordinary least squares estimator attain root-n convergence and the desired coverage even if the dimensionality of treatment is allowed to grow. In a high-dimensional sparse (HDS) regime, we show that second-stage LASSO and debiased LASSO have asymptotic properties equivalent to those of oracle estimators with no upstream error. We argue that these advances make Double ML methods a desirable alternative for practitioners estimating short-term demand elasticities in non-contractual settings.
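As a rough illustration of the two-stage idea in the LD regime, the sketch below residualizes the outcome and each treatment on the controls using cross-fitting, then runs ordinary least squares of the outcome residuals on the treatment residuals. It is not the authors' implementation: the function names (`ridge_fit_predict`, `double_ml_plm`, `simulate`), the choice of ridge regression as the first-stage learner, the two-fold split, and the simulated data are all assumptions made for illustration. It also treats observations as independent, so it ignores the panel structure and sequential exogeneity that the paper handles.

```python
import numpy as np

def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Fit ridge regression on the training fold and predict on the test fold."""
    p = X_train.shape[1]
    # Closed-form ridge solution; y_train may be a vector or a matrix of targets.
    beta = np.linalg.solve(X_train.T @ X_train + alpha * np.eye(p),
                           X_train.T @ y_train)
    return X_test @ beta

def double_ml_plm(y, D, X, n_folds=2, alpha=1.0, seed=0):
    """Cross-fitted residual-on-residual OLS for y = D @ theta + g(X) + e."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty_like(y, dtype=float)
    D_res = np.empty_like(D, dtype=float)
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        # First stage: predict outcome and treatments from controls,
        # using only observations outside the current fold.
        y_res[test_idx] = y[test_idx] - ridge_fit_predict(
            X[train_idx], y[train_idx], X[test_idx], alpha)
        D_res[test_idx] = D[test_idx] - ridge_fit_predict(
            X[train_idx], D[train_idx], X[test_idx], alpha)
    # Second stage: OLS of outcome residuals on treatment residuals.
    theta_hat, *_ = np.linalg.lstsq(D_res, y_res, rcond=None)
    return theta_hat

def simulate(n=2000, p=20, seed=1):
    """Hypothetical data: two treatments confounded by the first two controls."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    theta = np.array([-1.5, 0.5])
    D = X[:, :2] + rng.normal(size=(n, 2))
    y = D @ theta + X[:, 0] - X[:, 1] + rng.normal(size=n)
    return y, D, X, theta

y, D, X, theta = simulate()
print("true theta:     ", theta)
print("estimated theta:", double_ml_plm(y, D, X))
```

In a demand setting one might read `y` as log sales and the columns of `D` as log prices, so that the entries of `theta` play the role of own- and cross-price elasticities. In the HDS regime the final least squares step would be replaced by a LASSO or debiased LASSO fit on the same residuals.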

[1] A. Belloni, et al. Least Squares After Model Selection in High-Dimensional Sparse Models, 2009, 1001.0188.

[2] W. Newey, et al. Double machine learning for treatment and causal parameters, 2016.

[3] Martin J. Wainwright, et al. A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers, 2009, NIPS.

[4] Kengo Kato, et al. Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors, 2013.

[5] Victor Chernozhukov, et al. Post-Selection Inference for Generalized Linear Models With Many Controls, 2013, 1304.3969.

[6] Christian Hansen, et al. Inference in High-Dimensional Panel Models With an Application to Gun Control, 2014, 1411.6507.

[7] Victor Chernozhukov, et al. Testing Many Moment Inequalities, 2013.

[8] Victor Chernozhukov, et al. Uniform post-selection inference for least absolute deviation regression and other Z-estimation problems, 2013, 1304.0282.

[9] Amit Gandhi, et al. Measuring Substitution Patterns in Differentiated-Products Industries, 2019.

[10] Adel Javanmard, et al. Confidence intervals and hypothesis testing for high-dimensional regression, 2013, J. Mach. Learn. Res.

[11] W. Wu, et al. Gaussian Approximation for High Dimensional Time Series, 2015, 1508.07036.

[12] J. Robins, et al. Double/Debiased Machine Learning for Treatment and Causal Parameters, 2016, 1608.00060.

[13] Sara van de Geer, et al. Statistics for High-Dimensional Data, 2011.

[14] D. McLeish. Dependent Central Limit Theorems and Invariance Principles, 1974.

[15] Susan Athey, et al. Beyond prediction: Using big data for policy problems, 2017, Science.

[16] Peter E. Rossi, et al. Why Don't Prices Rise During Periods of Peak Demand? Evidence from Scanner Data, 2002.

[17] Stefan Wager, et al. Estimation and Inference of Heterogeneous Treatment Effects using Random Forests, 2015, Journal of the American Statistical Association.

[18] Martin Spindler, et al. High-Dimensional $L_2$Boosting: Rate of Convergence, 2016, 1602.08927.

[19] M. Rudelson. Random Vectors in the Isotropic Position, 1996, math/9608208.

[20] Shuheng Zhou, et al. Reconstruction from Anisotropic Random Measurements, 2012, 25th Annual Conference on Learning Theory.

[21] Steven T. Berry, et al. Automobile Prices in Market Equilibrium, 1995.