Instrument Space Selection for Kernel Maximum Moment Restriction

Kernel maximum moment restriction (KMMR) has recently emerged as a popular framework for instrumental variable (IV) based conditional moment restriction (CMR) models, with important applications in conditional moment (CM) testing and in parameter estimation for IV regression and proximal causal learning. The effectiveness of this framework, however, depends critically on the choice of the reproducing kernel Hilbert space (RKHS) used as the space of instruments. In this work, we present a systematic way to select the instrument space for parameter estimation based on the principle of the least identifiable instrument space (LIIS), the space that identifies the model parameters with the least complexity. Our selection criterion combines two distinct objectives to determine such an optimal space: (i) a test criterion to check identifiability; and (ii) an information criterion based on the effective dimension of the RKHS as a complexity measure. We analyze the consistency of our method in determining the LIIS and demonstrate its effectiveness for parameter estimation via simulations.
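
To make the complexity measure concrete, the following is a minimal sketch of how an empirical effective dimension of an RKHS could be computed from a Gram matrix on instrument samples. It assumes the standard definition from the kernel learning literature, the sum of mu_i / (mu_i + reg) over the eigenvalues mu_i of the normalized Gram matrix. The Gaussian kernel, the bandwidth grid, the regularization value `reg`, and the function names are illustrative assumptions, not the criterion or settings used in this work.

```python
import numpy as np


def gaussian_gram(z, bandwidth):
    """Gram matrix of a Gaussian kernel on instrument samples z (assumed kernel choice)."""
    z = np.asarray(z, dtype=float).reshape(len(z), -1)
    sq_dists = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def effective_dimension(gram, reg):
    """Empirical effective dimension: sum_i mu_i / (mu_i + reg),
    where mu_i are the eigenvalues of gram / n."""
    n = gram.shape[0]
    # Clip tiny negative eigenvalues that arise from floating-point error.
    eigvals = np.clip(np.linalg.eigvalsh(gram / n), 0.0, None)
    return float(np.sum(eigvals / (eigvals + reg)))


# Hypothetical data: 200 scalar instrument samples.
rng = np.random.default_rng(0)
z = rng.normal(size=200)

# Candidate instrument spaces indexed by bandwidth (illustrative grid).
dims = {bw: effective_dimension(gaussian_gram(z, bw), reg=1e-3)
        for bw in (0.1, 1.0, 10.0)}

for bw, d in dims.items():
    print(f"bandwidth={bw:5.1f}  effective dimension={d:.2f}")
```

In this sketch a smaller bandwidth yields a richer space and a larger effective dimension, so an information criterion built on this quantity would favor wider kernels unless an identifiability check rules them out.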
