Learning Deep Features in Instrumental Variable Regression

Instrumental variable (IV) regression is a standard strategy for learning causal relationships between confounded treatment and outcome variables from observational data by utilizing an instrumental variable, which affects the outcome only through the treatment. In classical IV regression, learning proceeds in two stages: stage 1 performs linear regression from the instrument to the treatment; and stage 2 performs linear regression from the treatment to the outcome, conditioned on the instrument. We propose a novel method, deep feature instrumental variable regression (DFIV), to address the case where relations between instruments, treatments, and outcomes may be nonlinear. In this case, deep neural nets are trained to define informative nonlinear features on the instruments and treatments. We propose an alternating training regime for these features to ensure good end-to-end performance when composing stages 1 and 2, thus obtaining highly flexible feature maps in a computationally efficient manner. DFIV outperforms recent state-of-the-art methods on challenging IV benchmarks, including settings involving high-dimensional image data. DFIV also exhibits competitive performance in off-policy policy evaluation for reinforcement learning, which can be understood as an IV regression task.
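
The classical two-stage procedure described above can be illustrated with a minimal NumPy sketch. This is not the DFIV method and is not taken from the paper: the data-generating process in `simulate`, the function names, and all coefficients are assumptions chosen only to show how a confounder biases ordinary least squares while two-stage least squares recovers the structural slope.

```python
import numpy as np


def two_stage_least_squares(z, x, y):
    """Classical linear 2SLS with intercepts; z, x, y are 1-D arrays."""
    n = len(z)
    Z = np.column_stack([np.ones(n), z])
    # Stage 1: linear regression from the instrument to the treatment.
    beta1, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ beta1
    # Stage 2: linear regression from the predicted treatment to the outcome.
    X_hat = np.column_stack([np.ones(n), x_hat])
    beta2, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return beta2  # [intercept, slope]


def simulate(n=20000, seed=0):
    """Hypothetical confounded data; the true structural slope is 2.0."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)  # instrument: affects y only through x
    u = rng.normal(size=n)  # unobserved confounder of x and y
    x = z + u + 0.1 * rng.normal(size=n)
    y = 2.0 * x - 3.0 * u + 0.1 * rng.normal(size=n)
    return z, x, y


z, x, y = simulate()
ols_slope = np.polyfit(x, y, 1)[0]  # naive regression, biased by u
iv_slope = two_stage_least_squares(z, x, y)[1]
print(f"OLS slope: {ols_slope:.3f}, 2SLS slope: {iv_slope:.3f}")
```

In this simulated setting the naive regression of `y` on `x` is pulled well away from the structural slope, while the two-stage estimate lands close to it. DFIV, as the abstract describes, replaces the linear maps in both stages with learned neural-network features.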

Nando de Freitas | Arnaud Doucet | Arthur Gretton | Liyuan Xu | Yutian Chen | Siddarth Srinivasan
