On the Convergence of Eigenspaces in Kernel Principal Component Analysis

This paper presents a non-asymptotic statistical analysis of kernel PCA (KPCA) with a focus different from that of previous work on the topic. Instead of considering the reconstruction error of KPCA, we are interested in approximation error bounds for the eigenspaces themselves. We prove an upper bound that depends on the spacing between eigenvalues but not on the dimensionality of the eigenspace. As a consequence, this allows us to infer stability results for these estimated spaces.
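
The quantity under study can be illustrated numerically. The following is a minimal sketch, not the paper's construction: it assumes a linear kernel (so that KPCA reduces to ordinary PCA in a finite-dimensional space), Gaussian data with a known diagonal covariance, and a hypothetical spectrum chosen only for illustration. It measures the Hilbert-Schmidt (Frobenius) distance between the projector onto the leading empirical eigenspace and its population counterpart, and prints it next to the eigenvalue gap for comparison. The function names are illustrative, and no bound from the paper is checked here.

```python
import numpy as np


def top_eigenspace_projector(sym_matrix, dim):
    """Orthogonal projector onto the span of the `dim` leading eigenvectors."""
    _, eigvecs = np.linalg.eigh(sym_matrix)  # eigenvalues in ascending order
    leading = eigvecs[:, -dim:]              # last columns = largest eigenvalues
    return leading @ leading.T


def eigenspace_error(n_samples, spectrum, dim, rng):
    """Frobenius distance between empirical and population projectors.

    Assumes zero-mean Gaussian data whose covariance is diag(spectrum).
    """
    spectrum = np.asarray(spectrum, dtype=float)
    population_cov = np.diag(spectrum)
    samples = rng.standard_normal((n_samples, spectrum.size)) * np.sqrt(spectrum)
    # Uncentered second-moment matrix; the data are zero-mean by construction.
    empirical_cov = samples.T @ samples / n_samples
    p_true = top_eigenspace_projector(population_cov, dim)
    p_hat = top_eigenspace_projector(empirical_cov, dim)
    return np.linalg.norm(p_hat - p_true, ord="fro")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spectrum = np.array([4.0, 3.0, 1.0, 0.5, 0.25])  # hypothetical eigenvalues
    dim = 2
    gap = spectrum[dim - 1] - spectrum[dim]          # spacing lambda_D - lambda_{D+1}
    for n in (100, 1000, 10000):
        err = eigenspace_error(n, spectrum, dim, rng)
        print(f"n={n:6d}  gap={gap:.2f}  projector error={err:.4f}")
```

Shrinking the gap between the second and third entries of `spectrum` (for instance, replacing 1.0 with 2.9) should make the estimated eigenspace visibly less stable at a fixed sample size, which is the qualitative dependence on eigenvalue spacing described above.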
