Random rates in anisotropic regression

In the context of minimax theory, we propose a new kind of risk, normalized by a random variable that is measurable with respect to the data. We present a notion of optimality and a method to construct optimal procedures accordingly. We apply this general setup to the problem of selecting significant variables in Gaussian white noise. In particular, we show that our method essentially improves the accuracy of estimation, in the sense of giving explicit improved confidence sets in the $L_2$-norm. Links to adaptive estimation are discussed.

1. Introduction. Searching for significant variables is certainly one of the oldest and most popular problems in statistics. One of the simplest models in which the issue of selecting significant variables was first stated mathematically is linear regression. A vast literature has since been devoted to this topic, and different approaches have been proposed over the last forty years, both for estimation and for hypothesis testing. Among many authors, we refer to Akaike [1], Breiman and Freedman [3], Chernoff [5], Csiszár and Körner [6], Dychakov [10], Patel [42], Rényi [46], Freidlina [13], Meshalkin [35], Malyutov and Tsitovich [34], Schwarz [47] and Stone [48].

In classical parametric regression, if we consider a linear model, we first have to measure the possible gain of “searching for a limited number of significant variables.” If the model comes from a specific field of application, then only an adequate description of the problem together with its solution is relevant. From a mathematical point of view, however, a theory of selecting significant variables does not lead, at least asymptotically, to a substantial improvement of the accuracy of estimation: in a regular parametric model, the classical $\sqrt{n}$ rate of convergence is not affected by the number of significant variables. (Even in this setup, let us emphasize that “asymptotically” has to be understood as “up to a constant,” and that the correct choice of significant variables may improve this constant.)

If instead of a linear model we consider a nonparametric regression model, the search for significant variables becomes crucial for estimating the regression function: the rate of convergence depends explicitly on the set of significant variables. Let us develop this statement with the following example of multivariate regression: suppose we observe $Z^{(n)} = (X_i, Y_i,\ i = 1, \dots, n)$ in the model $Y_i = f(X_i) + \xi_i$, where $f$ is an unknown regression function of $d$ variables and the $\xi_i$ are noise variables.
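Two displays may help fix ideas here; both are standard facts recalled for orientation only, and the notation ($\tilde f_n$, $\rho_n$, $w$, $\Sigma_d(\beta, L)$) is introduced for this sketch rather than taken from the paper. First, the risk announced in the abstract replaces the deterministic normalization of classical minimax theory by a random one: for an estimator $\tilde f_n$ and a normalizing random variable $\rho_n = \rho_n(Z^{(n)})$, measurable with respect to the data, one considers, schematically,
\[
\sup_{f}\ \mathbb{E}_f\, w\!\left( \rho_n^{-1}\, \| \tilde f_n - f \|_2 \right),
\]
where $w$ is a loss function; a pair $(\tilde f_n, \rho_n)$ is then judged by how small a random rate $\rho_n$ it can sustain while this risk stays bounded. Second, the gap between the parametric and nonparametric situations just described is quantified by the classical minimax rates: if $\Sigma_d(\beta, L)$ denotes a ball of $\beta$-Hölder functions of $d$ variables, then
\[
\inf_{\hat f_n}\ \sup_{f \in \Sigma_d(\beta, L)} \mathbb{E}_f \|\hat f_n - f\|_2 \asymp n^{-\beta/(2\beta + d)},
\]
so that if the regression function in fact depends only on $d' < d$ significant variables, the attainable rate improves from $n^{-\beta/(2\beta + d)}$ to $n^{-\beta/(2\beta + d')}$. The set of significant variables thus enters the exponent itself, whereas in the parametric case it can affect constants only.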

References

[1] A. Tsybakov et al. Sharp adaptation for inverse problems with random noise, 2002.
[2] O. Lepski et al. Adaptive non-parametric estimation of smooth multivariate functions, 1999.
[3] P. Massart et al. Risk bounds for model selection via penalization, 1999.
[4] L. D. Brown et al. Superefficiency in nonparametric function estimation, 1997.
[5] A. Tsybakov et al. Oracle inequalities for inverse problems, 2002.
[6] P. Hall. Effect of bias estimation on coverage accuracy of bootstrap confidence intervals for a probability density, 1992.
[7] K.-C. Li. Honest confidence regions for nonparametric regression, 1989.
[8] G. Kerkyacharian et al. Nonlinear estimation in anisotropic multi-index denoising, 2001.
[9] L. Qian. Nonparametric Curve Estimation: Methods, Theory, and Applications. Technometrics, 1999.
[10] A. Goldenshluger. Adaptive prediction and estimation in linear regression with infinitely many parameters, 2001.
[11] J. J. Faraway. Bootstrap selection of bandwidth and confidence bands for nonparametric regression, 1990.
[12] M. H. Neumann. Automatic bandwidth choice and confidence intervals in nonparametric regression, 1995.
[13] R. von Sachs. Wavelet thresholding in anisotropic function classes and application to adaptive estimation of evolutionary spectra, 1997.
[14] A. Tsybakov. Introduction à l'estimation non-paramétrique, 2003.
[15] J. Polzehl et al. Image denoising: pointwise adaptive approach, 2003.
[16] M. G. Low. On nonparametric confidence intervals, 1997.
[17] P. Hall. Edgeworth expansions for nonparametric density estimators, with applications, 1991.
[18] V. Spokoiny et al. Multiscale testing of qualitative hypotheses, 2001.
[19] D. Picard et al. Adaptive confidence interval for pointwise curve estimation, 2000.
[20] A. Tsybakov. Optimal aggregation of classifiers in statistical learning, 2003.
[21] O. Lepskii. On a problem of adaptive estimation in Gaussian white noise, 1991.