Inference for High-Dimensional Exchangeable Arrays

We consider inference for high-dimensional exchangeable arrays where the dimension may be much larger than the cluster sizes. Specifically, we consider separately and jointly exchangeable arrays, which correspond to multiway clustered and polyadic data, respectively. Such exchangeable arrays have seen a surge of applications in empirical economics. However, both exchangeability concepts induce highly complicated dependence structures, which poses a significant challenge for inference in high dimensions. In this paper, we first derive high-dimensional central limit theorems (CLTs) over the rectangles for the exchangeable arrays. Building on the high-dimensional CLTs, we develop novel multiplier bootstraps for the exchangeable arrays and derive their finite sample error bounds in high dimensions. The derivations of these theoretical results rely on new technical tools, such as a Hoeffding-type decomposition for the exchangeable arrays and maximal inequalities for the degenerate components of that decomposition. We illustrate applications of our bootstrap methods to robust inference in demand analysis, robust inference in extended gravity analysis, and penalty choice for $\ell_1$-penalized regression under multiway cluster sampling.
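To make the multiplier bootstrap idea concrete, the following is a minimal sketch for the simplest case: a fully observed two-way separately exchangeable array (two-way clustering) of p-dimensional observations. It is an illustration under stated assumptions, not the paper's exact procedure. The function name, its parameters, the Gaussian multipliers, and the scaling by the smaller cluster count are all assumptions made for this example. The sketch perturbs the row-mean and column-mean deviations (the linear part of a Hoeffding-type decomposition) with independent multipliers and takes a quantile of the resulting max-norm statistic.

```python
import numpy as np


def multiplier_bootstrap_critical_value(X, n_boot=500, alpha=0.05, seed=0):
    """Sketch of a multiplier bootstrap for a two-way clustered array.

    X has shape (N, M, p): N row clusters, M column clusters, p coordinates.
    Returns the sample mean (shape (p,)) and a bootstrap (1 - alpha) quantile
    of the max-norm of the perturbed linear (projection) terms.
    All names here are hypothetical and chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    N, M, p = X.shape
    n = min(N, M)  # assumed effective sample size: the smaller cluster count

    grand_mean = X.mean(axis=(0, 1))          # shape (p,)
    row_dev = X.mean(axis=1) - grand_mean     # shape (N, p): row-cluster effects
    col_dev = X.mean(axis=0) - grand_mean     # shape (M, p): column-cluster effects

    # Independent standard normal multipliers, one per row and per column cluster.
    xi_row = rng.standard_normal((n_boot, N))
    xi_col = rng.standard_normal((n_boot, M))

    # Each bootstrap draw is a p-vector mimicking sqrt(n) * (mean - truth).
    draws = np.sqrt(n) * (xi_row @ row_dev / N + xi_col @ col_dev / M)
    max_stats = np.abs(draws).max(axis=1)     # max-norm over the p coordinates

    return grand_mean, float(np.quantile(max_stats, 1 - alpha))


# Usage on simulated data with additive row and column shocks.
rng = np.random.default_rng(1)
N, M, p = 20, 15, 50
X = (rng.standard_normal((N, 1, p))
     + rng.standard_normal((1, M, p))
     + rng.standard_normal((N, M, p)))
mean_hat, crit = multiplier_bootstrap_critical_value(X, n_boot=200)
half_width = crit / np.sqrt(min(N, M))  # simultaneous band: mean_hat +/- half_width
```

Under these assumptions, `mean_hat ± half_width` gives a simultaneous confidence band over all p coordinates. Taking the maximum over coordinates is what makes the band simultaneous, which is the role the CLTs over rectangles play in the high-dimensional setting.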
