Quasi-Bayesian Dual Instrumental Variable Regression

Recent years have witnessed an upsurge of interest in employing flexible machine learning models for instrumental variable (IV) regression, but uncertainty quantification methodology for these models is still lacking. In this work we present a novel quasi-Bayesian procedure for IV regression, building upon recently developed kernelized IV models and the dual/minimax formulation of IV regression. We analyze the frequentist behavior of the proposed method by establishing minimax optimal contraction rates in L2 and Sobolev norms and by discussing the frequentist validity of credible balls. We further derive a scalable inference algorithm that can be extended to work with wide neural network models. Empirical evaluation shows that our method produces informative uncertainty estimates on complex high-dimensional problems.
