On the Sample Complexity of Subspace Learning

A large number of algorithms in machine learning, from principal component analysis (PCA) and its non-linear (kernel) extensions to more recent spectral embedding and support estimation methods, rely on estimating a linear subspace from samples. In this paper we introduce a general formulation of this problem and derive novel learning error estimates. Our results rely on natural assumptions on the spectral properties of the covariance operator associated with the data distribution, and hold for a wide class of metrics between subspaces. As special cases, we discuss sharp error estimates for the reconstruction properties of PCA and for spectral support estimation. Key to our analysis is an operator-theoretic approach that has broad applicability to spectral learning methods.
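To make the setting concrete, the sketch below is a minimal illustration (not the paper's algorithm or its bounds): it estimates a k-dimensional subspace from samples via the top eigenvectors of the empirical covariance and compares the resulting projection with the population one under one possible metric between subspaces, the operator norm of the projection difference. The dimension, sample size, Gaussian model, and spectral decay used here are illustrative assumptions only.

```python
# Minimal sketch: empirical subspace estimation via PCA and a subspace metric.
# All model choices (d, n, k, Gaussian data, geometric spectrum) are assumptions
# made for illustration, not quantities taken from the paper.
import numpy as np

def top_k_projection(cov, k):
    """Orthogonal projection onto the span of the top-k eigenvectors of cov."""
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    U = eigvecs[:, -k:]                     # top-k eigenvectors
    return U @ U.T

rng = np.random.default_rng(0)
d, n, k = 20, 500, 3

# Population covariance with a fast-decaying spectrum (a typical assumption in
# this literature); its top-k eigenspace plays the role of the "true" subspace.
spectrum = np.array([2.0 ** (-i) for i in range(d)])
true_cov = np.diag(spectrum)
P_true = top_k_projection(true_cov, k)

# Draw samples, form the empirical covariance, and estimate the subspace.
X = rng.multivariate_normal(np.zeros(d), true_cov, size=n)
P_hat = top_k_projection(X.T @ X / n, k)

# One metric between subspaces: operator norm of the projection difference.
err = np.linalg.norm(P_hat - P_true, ord=2)
print(f"subspace estimation error (operator norm): {err:.4f}")
```

Running the sketch with growing n shows the projection error shrinking, which is the qualitative behavior that learning error estimates of this kind quantify.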
