Slightly Conservative Bootstrap for Maxima of Sums.

We study the bootstrap for the maxima of sums of independent random variables, a problem of high relevance to many applications in modern statistics. Since the consistency of the bootstrap was justified by Gaussian approximation in Chernozhukov et al. (2013), quite a few attempts have been made to sharpen the error bound for the bootstrap and to reduce the sample size requirement for bootstrap consistency. In this paper, we show that the sample size requirement can be dramatically improved when the inference is made slightly conservative, that is, when the bootstrap quantile $t_{\alpha}^*$ is inflated by a small fraction, e.g. by $1\%$ to $1.01\,t^*_\alpha$. This simple procedure yields error bounds for the coverage probability of the conservative bootstrap at a rate as fast as $\sqrt{(\log p)/n}$ under suitable conditions, so that not only can the sample size requirement be reduced to $\log p \ll n$, but the overall convergence rate is also nearly parametric. Furthermore, we improve the error bound for the coverage probability of the standard non-conservative bootstrap to $[(\log (np))^3 (\log p)^2/n]^{1/4}$ under general assumptions on the data. These results are established for the empirical bootstrap and the multiplier bootstrap with third moment match. An improved coherent Lindeberg interpolation method, originally proposed in Deng and Zhang (2017), is developed to derive sharper comparison bounds, especially for the maxima.
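
The inflation step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the statistic is $\max_j n^{-1/2}\sum_i (X^*_{ij} - \bar X_j)$ under the empirical bootstrap, that the resulting quantile $t^*_\alpha$ is positive (so that multiplying by $1.01$ is indeed conservative), and the function name `conservative_bootstrap_quantile` and its parameters are hypothetical.

```python
import numpy as np


def conservative_bootstrap_quantile(X, alpha=0.05, inflate=0.01, n_boot=1000, seed=0):
    """Return (t_star, inflated t_star) for the maximum of normalized sums.

    X is an (n, p) data matrix. The empirical bootstrap resamples rows with
    replacement; the statistic is max_j n^{-1/2} sum_i (X*_ij - mean_j).
    All names here are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    centered = X - X.mean(axis=0)  # center each coordinate at its sample mean

    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample n rows with replacement
        stats[b] = centered[idx].sum(axis=0).max() / np.sqrt(n)

    t_star = np.quantile(stats, 1 - alpha)  # standard bootstrap quantile
    return t_star, (1 + inflate) * t_star   # e.g. 1.01 * t_star


if __name__ == "__main__":
    data = np.random.default_rng(1).standard_normal((200, 50))
    t_star, t_cons = conservative_bootstrap_quantile(data)
    print(f"bootstrap quantile: {t_star:.3f}, conservative: {t_cons:.3f}")
```

The only change relative to a standard bootstrap is the final multiplication by `1 + inflate`; the resampling itself is untouched, which is why the abstract calls the procedure simple.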

[18]  Yuta Koike Notes on the dimension dependence in high-dimensional central limit theorems for hyperrectangles , 2019, Japanese Journal of Statistics and Data Science.