Error Estimates for Multivariate Regression on Discretized Function Spaces

In this paper, we discuss the discretization error in the regression setting and derive error bounds that rely on the approximation properties of the discretized space. Furthermore, we point out how the sampling error and the discretization error interact and how they can be balanced appropriately. We present two examples based on tensor product spaces (sparse grids, hyperbolic crosses), which provide a suitable approach for large sample sets in moderate dimensions.
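To make the interplay between the two error sources concrete, the following display sketches the standard bias-variance-type split used in discrete least-squares analyses. It is a schematic under generic assumptions only; the symbols $f$, $f_{n,M}$, $V_M$, $n$, and $\sigma^2$ are illustrative placeholders, and the norms, constants, and precise balancing in the paper may differ.

% schematic decomposition; symbols are illustrative, not taken from the paper
\[
  \mathbb{E}\,\| f_{n,M} - f \|_{L^2}^2
  \;\lesssim\;
  \underbrace{\inf_{g \in V_M} \| g - f \|_{L^2}^2}_{\text{discretization error}}
  \;+\;
  \underbrace{\frac{\sigma^2 \,\dim(V_M)}{n}}_{\text{sampling error}}
\]

Here $f_{n,M}$ denotes a least-squares fit in the discretized space $V_M$ computed from $n$ noisy samples of $f$. Balancing the two contributions then amounts to choosing the discretization level $M = M(n)$ so that both terms decay at the same rate as $n$ grows.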
