Spectral Dimensionality Reduction

In this paper, we study and place under a common framework a number of non-linear dimensionality reduction methods, such as Locally Linear Embedding (LLE), Isomap, Laplacian Eigenmaps, and kernel PCA, which are based on performing an eigen-decomposition (hence the name 'spectral'). The framework also includes classical methods such as PCA and metric multidimensional scaling (MDS), as well as the data transformation step used in spectral clustering. We show that in all of these cases the learning algorithm estimates the principal eigenfunctions of an operator that depends on the unknown data density and on a kernel that is not necessarily positive semi-definite. This view helps to generalize some of these algorithms so that they can predict the embedding of out-of-sample examples without retraining the model. It also makes more transparent what these algorithms minimize on the empirical data and gives a corresponding notion of generalization error.
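To make the shared recipe concrete, here is a minimal sketch (our own illustration, not code from the paper): it double-centers a Gaussian Gram matrix, takes its leading eigenvectors to embed the training points as in kernel PCA, and then applies a Nyström-style projection to place a new point in the same embedding without recomputing the eigen-decomposition. The Gaussian kernel, the bandwidth `sigma`, the target dimension `d`, and the function name are illustrative assumptions; the other spectral methods covered by the framework correspond to other data-dependent kernels.

```python
import numpy as np

def spectral_embedding_with_nystrom(X, x_new, sigma=1.0, d=2):
    """Sketch of the shared spectral recipe: kernel-PCA-style embedding of the
    training set plus a Nystrom-style out-of-sample extension for a new point.
    The Gaussian kernel and the parameters sigma, d are illustrative choices."""
    n = X.shape[0]

    def gram(A, B):
        # Pairwise squared distances, then Gaussian kernel values.
        sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
        return np.exp(-sq / (2 * sigma**2))

    # Data-dependent (double-centered) Gram matrix, as in kernel PCA / metric MDS.
    K = gram(X, X)
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H

    # Leading eigenvectors of the centered Gram matrix give the training embedding.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    top = np.argsort(eigvals)[::-1][:d]
    lam, V = eigvals[top], eigvecs[:, top]
    train_embedding = V * np.sqrt(lam)                 # n x d coordinates

    # Out-of-sample extension: center the new point's kernel row against the
    # training data, then project onto the same eigenvectors (Nystrom formula).
    k_new = gram(x_new[None, :], X).ravel()
    k_new_c = k_new - k_new.mean() - K.mean(axis=0) + K.mean()
    new_embedding = (k_new_c @ V) / np.sqrt(lam)       # d coordinates, no retraining

    return train_embedding, new_embedding
```

Feeding a training row back in as `x_new` reproduces that row of `train_embedding` up to numerical error, which is the consistency one expects from the out-of-sample extension described above.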

[1] W. Torgerson. Multidimensional scaling: I. Theory and method, 1952.

[2] J. Gower. Adding a point to vector diagrams in multivariate analysis, 1968.

[3] E. Kreyszig. Introductory Functional Analysis With Applications, 1978.

[4] R. Taylor, et al. The Numerical Treatment of Integral Equations, 1978.

[5] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.

[6] Eric Saund, et al. Dimensionality-Reduction Using Connectionist Networks, 1989, IEEE Trans. Pattern Anal. Mach. Intell.

[7] Teuvo Kohonen, et al. The self-organizing map, 1990.

[8] Audra E. Kosh, et al. Linear Algebra and its Applications, 1992.

[9] George Karypis, et al. Introduction to Parallel Computing, 1994.

[10] Jonathan Baxter, et al. Learning internal representations, 1995, COLT '95.

[11] Geoffrey E. Hinton, et al. The EM algorithm for mixtures of factor analyzers, 1996.

[12] Shang-Hua Teng, et al. Spectral partitioning works: planar graphs and finite element meshes, 1996, Proceedings of the 37th Conference on Foundations of Computer Science.

[13] Jon A. Wellner, et al. Weak Convergence and Empirical Processes: With Applications to Statistics, 1996.

[14] Fan Chung. Spectral Graph Theory, 1996.

[15] Bernhard Schölkopf. Support vector learning, 1997.

[16] Jitendra Malik, et al. Normalized cuts and image segmentation, 1997, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[17] Bernhard Schölkopf, et al. Nonlinear Component Analysis as a Kernel Eigenvalue Problem, 1998, Neural Computation.

[18] V. Koltchinskii. Asymptotics of Spectral Projections of Some Random Matrices Approximating Integral Operators, 1998.

[19] B. Schölkopf, et al. Advances in kernel methods: support vector learning, 1999.

[20] Yair Weiss, et al. Segmentation using eigenvectors: a unifying view, 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[21] J. Tenenbaum, et al. A global geometric framework for nonlinear dimensionality reduction, 2000, Science.

[22] Christopher K. I. Williams, et al. The Effect of the Input Density Distribution on Kernel-based Classifiers, 2000, ICML.

[23] V. Koltchinskii, et al. Random matrix approximation of spectra of integral operators, 2000.

[24] S. T. Roweis, et al. Nonlinear dimensionality reduction by locally linear embedding, 2000, Science.

[25] Christopher K. I. Williams, et al. Using the Nyström Method to Speed Up Kernel Machines, 2000, NIPS.

[26] Trevor F. Cox, et al. Metric multidimensional scaling, 2000.

[27] Nello Cristianini, et al. On the Concentration of Spectral Properties, 2001, NIPS.

[28] Michael I. Jordan, et al. On Spectral Clustering: Analysis and an algorithm, 2001, NIPS.

[29] John Shawe-Taylor, et al. The Stability of Kernel Principal Components Analysis and its Relation to the Process Eigenspectrum, 2002, NIPS.

[30] Mikhail Belkin, et al. Using manifold structure for partially labelled classification, 2002, NIPS.

[31] Dimitrios Gunopulos, et al. Non-linear dimensionality reduction techniques for classification and visualization, 2002, KDD.

[32] Pascal Vincent, et al. Manifold Parzen Windows, 2002, NIPS.

[33] Yee Whye Teh, et al. Automatic Alignment of Local Representations, 2002, NIPS.

[34] Matthew Brand, et al. Charting a Manifold, 2002, NIPS.

[35] Adam Krzyzak, et al. Piecewise Linear Skeletonization Using Principal Curves, 2002, IEEE Trans. Pattern Anal. Mach. Intell.

[36] Balázs Kégl, et al. Intrinsic Dimension Estimation Using Packing Numbers, 2002, NIPS.

[37] Joshua B. Tenenbaum, et al. Global Versus Local Methods in Nonlinear Dimensionality Reduction, 2002, NIPS.

[38] Yoshua Bengio, et al. Spectral Clustering and Kernel PCA are Learning Eigenfunctions, 2003.

[39] D. Donoho, et al. Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data, 2003, Proceedings of the National Academy of Sciences of the United States of America.

[40] Nicolas Le Roux, et al. Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering, 2003, NIPS.

[41] Mikhail Belkin, et al. Laplacian Eigenmaps for Dimensionality Reduction and Data Representation, 2003, Neural Computation.

[42] Lawrence K. Saul, et al. Think Globally, Fit Locally: Unsupervised Learning of Low Dimensional Manifolds, 2003, J. Mach. Learn. Res.

[43] Nikos A. Vlassis, et al. Non-linear CCA and PCA by Alignment of Local Models, 2003, NIPS.

[44] D. Donoho, et al. Hessian Eigenmaps: new locally linear embedding techniques for high-dimensional data, 2003.

[45] Gilles Blanchard, et al. Statistical properties of Kernel Principal Component Analysis, 2019.

[46] Bernhard Schölkopf, et al. A kernel view of the dimensionality reduction of manifolds, 2004, ICML.

[47] Nicolas Le Roux, et al. Learning Eigenfunctions Links Spectral Embedding and Kernel PCA, 2004, Neural Computation.

[48] H. Bourlard, et al. Auto-association by multilayer perceptrons and singular value decomposition, 1988, Biological Cybernetics.

[49] Christopher K. I. Williams. On a Connection between Kernel PCA and Metric Multidimensional Scaling, 2004, Machine Learning.

[50] José Carlos Príncipe, et al. Nonlinear Component Analysis Based on Correntropy, 2006, The 2006 IEEE International Joint Conference on Neural Network Proceedings.

[51] Laurent Zwald. Statistical properties of kernel principal component analysis, 2006, Machine Learning.

[52] T. Hastie, et al. Principal Curves, 2007.