Paper information - Brownian Motions and Scrambled Wavelets for Least-Squares Regression

Brownian Motions and Scrambled Wavelets for Least-Squares Regression

We consider ordinary (non-penalized) least-squares regression where the regression function is chosen in a randomly generated subspace G_P \subset S of finite dimension P, where S is a function space of infinite dimension, e.g. L^2([0, 1]^d). G_P is defined as the span of P random features that are linear combinations of the basis functions of S weighted by i.i.d. Gaussian random coefficients. We characterize the so-called kernel space K \subset S of the resulting Gaussian process and derive approximation error bounds of order O(||f||_K^2 \log(P)/P) for functions f \in K approximated in G_P. We apply this result to derive excess risk bounds for the least-squares estimate in various spaces. For illustration, we consider regression using the so-called scrambled wavelets (i.e. random linear combinations of wavelets of L^2([0, 1]^d)) and derive an excess risk rate O(||f^*||_K (\log N)/\sqrt{N}), which is arbitrarily close to the minimax optimal rate (up to a logarithmic factor) for target functions f^* in K = H^s([0, 1]^d), a Sobolev space of smoothness order s > d/2. We describe an efficient implementation using lazy expansions with numerical complexity \tilde{O}(2^d N^{3/2} \log N + N^{5/2}), where d is the dimension of the input data and N is the number of data points.
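The construction described in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: it truncates the basis at a fixed resolution instead of using lazy expansions, and the choice of hat functions scaled by 2^{-j/2} (loosely mimicking a Brownian-motion expansion), the function names `hat_features`, `fit_random_subspace` and `predict`, and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def hat_features(x, max_level):
    """Evaluate a truncated family of scaled hat functions at points x in [0, 1].

    Returns an (N, F) matrix with F = 2 ** (max_level + 1) columns:
    one linear term plus 2**j hat functions at each level j = 0..max_level.
    The 2 ** (-j / 2) scaling is an illustrative assumption.
    """
    x = np.asarray(x, dtype=float)
    cols = [x]  # coarse linear term
    for j in range(max_level + 1):
        for l in range(2 ** j):
            u = (2 ** j) * x - l  # map the support of hat (j, l) onto [0, 1]
            tri = np.clip(1.0 - np.abs(2.0 * u - 1.0), 0.0, None)  # peak 1 at u = 1/2
            cols.append(2.0 ** (-j / 2) * tri)
    return np.stack(cols, axis=1)


def fit_random_subspace(x, y, P, max_level, rng):
    """Ordinary least squares in the span of P Gaussian-weighted feature mixtures."""
    Phi = hat_features(x, max_level)             # (N, F) basis evaluations
    A = rng.standard_normal((Phi.shape[1], P))   # i.i.d. Gaussian coefficients
    Psi = Phi @ A                                # (N, P) random features
    beta, *_ = np.linalg.lstsq(Psi, y, rcond=None)
    return A, beta


def predict(x, A, beta, max_level):
    """Evaluate the fitted function at new points x."""
    return hat_features(x, max_level) @ A @ beta


# Toy data: a smooth target on [0, 1] observed with noise (hypothetical example).
rng = np.random.default_rng(0)
N = 200
x = rng.uniform(0.0, 1.0, N)
y = np.sin(2.0 * np.pi * x) + 0.1 * rng.standard_normal(N)

A, beta = fit_random_subspace(x, y, P=15, max_level=6, rng=rng)
y_hat = predict(x, A, beta, max_level=6)
train_sse = float(np.sum((y - y_hat) ** 2))
```

The value P = 15 is arbitrary. A choice of P on the order of \sqrt{N} is consistent with the (\log N)/\sqrt{N} rate quoted above, but the sketch does not attempt to reproduce the paper's tuning or its complexity guarantees.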

R. Munos | Odalric-Ambrym Maillard

[1] A. Tikhonov, et al. Solution of Incorrectly Formulated Problems and the Regularization Method, 1963.

[2] Saburou Saitoh, et al. Theory of Reproducing Kernels and Its Applications, 1988.

[3] G. Bourdaud. Ondelettes et espaces de Besov, 1995.

[4] M. Lifshits. Gaussian Random Functions, 1995.

[5] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.

[6] S. Mallat. A Wavelet Tour of Signal Processing, 1998.

[7] Stéphane Jaffard, et al. Décompositions en Ondelettes, 2000.

[8] S. Canu, et al. Functional learning through kernel, 2002.

[9] Adam Krzyzak, et al. A Distribution-Free Theory of Nonparametric Regression, 2002, Springer Series in Statistics.

[10] S. Canu, et al. Functional learning through kernel, 2009.

[11] H. Bungartz, et al. Sparse grids, 2004, Acta Numerica.

[12] Benjamin Recht, et al. Random Features for Large-Scale Kernel Machines, 2007, NIPS.

[13] A. Rahimi, et al. Uniform approximation of functions with random bases, 2008, 2008 46th Annual Allerton Conference on Communication, Control, and Computing.

[14] A. Barron, et al. Approximation and learning by greedy algorithms, 2008, arXiv:0803.1718.

[15] Michael Frazier, et al. Decomposition of Besov Spaces, 2009.

[16] Rémi Munos, et al. Compressed Least-Squares Regression, 2009, NIPS.

[17] Winfried Sickel, et al. Tensor products of Sobolev-Besov spaces and applications to approximation from the hyperbolic cross, 2009, J. Approx. Theory.