Error Analysis of Laplacian Eigenmaps for Semi-supervised Learning

We study the error and sample complexity of semi-supervised learning by Laplacian Eigenmaps in the limit of infinite unlabeled data. We provide a bound on the error and show that it is controlled by the graph Laplacian regularizer. Our analysis also guides the choice of the number of eigenvectors $k$ to use: when the data lies on a $d$-dimensional domain, the optimal choice of $k$ is of order $(n/\log n)^{d/(d+2)}$, yielding an asymptotic error rate of $(n/\log n)^{-2/(2+d)}$.
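To make the stated orders concrete, the following is a minimal sketch that evaluates both expressions for a given number of labeled points n and dimension d. The function name `eigenmaps_orders` is hypothetical, and every multiplicative constant is set to 1, which is an assumption: the abstract gives only orders of magnitude, so the outputs indicate scaling rather than usable values of k.

```python
import math
from typing import Tuple


def eigenmaps_orders(n: int, d: int) -> Tuple[float, float]:
    """Evaluate the orders of k and of the error stated in the abstract.

    Assumption: all hidden constants are taken to be 1, so the results
    show how the quantities scale with n and d, not their true values.
    """
    if n < 2:
        raise ValueError("n must be at least 2 so that log(n) > 0")
    if d < 1:
        raise ValueError("d must be a positive integer")

    base = n / math.log(n)                # the common factor n / log(n)
    k_order = base ** (d / (d + 2))       # order of the number of eigenvectors k
    error_order = base ** (-2 / (d + 2))  # order of the asymptotic error
    return k_order, error_order


if __name__ == "__main__":
    for d in (1, 2, 5, 10):
        k_order, error_order = eigenmaps_orders(10_000, d)
        print(f"d={d:2d}  k ~ {k_order:10.1f}  error ~ {error_order:.4f}")
```

Under these assumptions, a larger d pushes the order of k up and slows the decay of the error with n, which is the usual dependence on dimension in nonparametric rates.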
