Learning and approximation by Gaussians on Riemannian manifolds

Learning function relations, or understanding the structure of data lying on manifolds embedded in high-dimensional Euclidean spaces, is an important topic in learning theory. In this paper we study approximation and learning by Gaussians of functions defined on a d-dimensional connected compact C∞ Riemannian submanifold of ℝⁿ that is isometrically embedded. We show that convolution with the Gaussian kernel with variance σ provides a uniform approximation order of O(σ^s) when the approximated function is Lipschitz s for some s ∈ (0, 1]. Uniformly normal neighborhoods of a compact Riemannian manifold play a central role in deriving this approximation order. The approximation result is then used to analyze the regression learning algorithm generated by the multi-kernel least squares regularization scheme associated with Gaussian kernels with flexible variances. When the regression function is Lipschitz s, our learning rate is ((log² m)/m)^{s/(8s+4d)}, where m is the sample size. When the manifold dimension d is smaller than the dimension n of the ambient Euclidean space, this rate is much faster than those in the literature. By comparing approximation orders, we also exhibit an essential difference between approximation schemes with flexible variances and those with a single variance.
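The uniform approximation claim can be illustrated numerically. The sketch below (not the paper's exact operator) uses a normalized Gaussian average on the unit circle S¹ ⊂ ℝ², a compact 1-dimensional submanifold, applied to a Lipschitz (s = 1) function, and checks that the sup-norm error shrinks as the variance σ decreases. The normalization by the kernel's mass and the specific test function are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def gaussian_average_error(f, sigma, n=1000):
    """Sup-norm error of a normalized Gaussian average of f on the unit circle.

    (A_sigma f)(x) = sum_y exp(-|x-y|^2/sigma^2) f(y) / sum_y exp(-|x-y|^2/sigma^2),
    with |x-y| the ambient Euclidean distance and y ranging over a uniform grid.
    """
    theta = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    pts = np.stack([np.cos(theta), np.sin(theta)], axis=1)   # points on S^1 in R^2
    vals = f(theta)
    # squared ambient (Euclidean) distances between all pairs of grid points
    d2 = np.sum((pts[:, None, :] - pts[None, :, :]) ** 2, axis=2)
    w = np.exp(-d2 / sigma ** 2)
    approx = w @ vals / w.sum(axis=1)          # normalized Gaussian average
    return np.max(np.abs(approx - vals))       # uniform (sup-norm) error

# |cos t| is Lipschitz with s = 1 on the circle (kinks at t = pi/2, 3*pi/2)
f = lambda t: np.abs(np.cos(t))
errors = [gaussian_average_error(f, s) for s in (0.4, 0.2, 0.1)]
```

Halving σ should roughly halve the uniform error, consistent with an O(σ^s) rate at s = 1; the largest errors occur near the kinks of |cos t|, where the function is Lipschitz but not smooth.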
