RieszNet and ForestRiesz: Automatic Debiased Machine Learning with Neural Nets and Random Forests

Many causal and policy effects of interest are defined by linear functionals of high-dimensional or non-parametric regression functions. √n-consistent and asymptotically normal estimation of the object of interest requires debiasing to reduce the effects of regularization and/or model selection. Debiasing is typically achieved by adding to the plug-in estimator of the functional a correction term derived from a functional-specific theoretical analysis of what is known as the influence function, which yields properties such as double robustness and Neyman orthogonality. We instead implement an automatic debiasing procedure that learns the Riesz representation of the linear functional using neural nets and random forests. Our method requires only value-query oracle access to the linear functional. We propose a multi-tasking neural net debiasing method that minimizes a combined Riesz representer and regression loss by stochastic gradient descent, while sharing representation layers between the two functions. We also propose a random forest method that learns a locally linear representation of the Riesz function. Although our methodology applies to arbitrary functionals, we find experimentally that it beats the state-of-the-art performance of the prior neural net based estimator of Shi et al. (2019) for the average treatment effect functional. We also evaluate our method on the more challenging problem of estimating average marginal effects with continuous treatments, using semi-synthetic data on the effect of gasoline price changes on gasoline demand.
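The automatic debiasing idea above can be illustrated for the average treatment effect (ATE) functional. The sketch below is a simplified linear-in-features stand-in, not the paper's neural net or forest implementation: the data-generating process, the feature map `phi`, and the closed-form solve are all illustrative assumptions. It minimizes the quadratic Riesz loss E[α(Z)² − 2m(Z; α)] using only value queries of the moment functional m(Z; g) = g(1, X) − g(0, X), then forms the debiased estimate as plug-in regression plus Riesz correction.

```python
import numpy as np

# Illustrative simulation (made-up assumptions, not the paper's setup)
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=n)
p = 1 / (1 + np.exp(-X))                      # propensity score
T = (rng.uniform(size=n) < p).astype(float)   # binary treatment
Y = 2.0 * T + X + rng.normal(size=n)          # outcome; true ATE = 2

# Linear Riesz model alpha(t, x) = phi(t, x) @ theta
def phi(t, x):
    return np.column_stack([t, 1 - t, t * x, (1 - t) * x])

# The Riesz loss E[alpha(Z)^2 - 2 m(Z; alpha)] needs only value-query
# access to the functional; for the ATE, m(Z; g) = g(1, X) - g(0, X).
A = phi(T, X)                                  # basis at observed data
M = phi(np.ones(n), X) - phi(np.zeros(n), X)   # m applied to each basis fn

# First-order condition of the quadratic loss: (A'A/n) theta = E_n[M]
theta = np.linalg.solve(A.T @ A / n, M.mean(axis=0))
alpha_hat = A @ theta            # approximates T/p(X) - (1-T)/(1-p(X))

# Plug-in regression g plus the Riesz correction term
B = np.column_stack([np.ones(n), T, X])
beta = np.linalg.lstsq(B, Y, rcond=None)[0]
g = B @ beta
g1 = np.column_stack([np.ones(n), np.ones(n), X]) @ beta
g0 = np.column_stack([np.ones(n), np.zeros(n), X]) @ beta
ate_debiased = np.mean(g1 - g0) + np.mean(alpha_hat * (Y - g))
# ate_debiased should be close to the true ATE of 2 in this simulation
```

The correction term mean(α̂ · (Y − ĝ)) is what delivers the double robustness noted above: the estimate remains consistent if either the regression or the Riesz representer is estimated well.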

[1] C. Shi, D. M. Blei, V. Veitch. Adapting Neural Networks for the Estimation of Treatment Effects, NeurIPS, 2019.

[2] J. Friedman. Greedy function approximation: a gradient boosting machine, 2001.

[3] S. Athey, et al. Generalized random forests, The Annals of Statistics, 2016.

[4] W. K. Newey, et al. Cross-fitting and fast remainder rates for semiparametric estimation, 2017, arXiv:1801.09138.

[5] M. Carone, et al. Toward Computerized Efficient Estimation in Infinite-Dimensional Models, Journal of the American Statistical Association, 2016.

[6] T. Hastie, et al. Causal Interpretations of Black-Box Models, Journal of Business & Economic Statistics, 2019.

[7] D. Duvenaud, et al. Backpropagation through the Void: Optimizing control variates for black-box gradient estimation, ICLR, 2017.

[8] V. Syrgkanis, et al. Automatic Debiased Machine Learning via Neural Nets for Generalized Linear Regression, 2021, arXiv:2104.14737.

[9] V. Chernozhukov, et al. Automatic Debiased Machine Learning of Causal and Structural Effects, 2018.

[10] J. Robins, et al. Doubly Robust Estimation in Missing Data and Causal Inference Models, Biometrics, 2005.

[11] G. A. Young. Review of High-Dimensional Statistics: A Non-Asymptotic Viewpoint by M. J. Wainwright, International Statistical Review, 2020.

[12] M. J. van der Laan, et al. Double Robust Efficient Estimators of Longitudinal Treatment Effects: Comparative Performance in Simulations and a Case Study, The International Journal of Biostatistics, 2019.

[13] J. Horowitz, et al. Measuring the price responsiveness of gasoline demand: Economic shape restrictions and nonparametric demand estimation, 2011.

[14] S. Wager, et al. Policy Learning With Observational Data, Econometrica, 2017.

[15] V. Syrgkanis, et al. Adversarial Estimation of Riesz Representers, 2020, arXiv.

[16] J. Langford, et al. Doubly Robust Policy Evaluation and Learning, ICML, 2011.

[17] M. J. Wainwright. High-Dimensional Statistics: A Non-Asymptotic Viewpoint, Cambridge University Press, 2019.

[18] J. Horowitz, et al. Nonparametric Estimation of a Nonseparable Demand Function under the Slutsky Inequality Restriction, Review of Economics and Statistics, 2017.