Quality assessment of nonlinear dimensionality reduction based on K-ary neighborhoods

Nonlinear dimensionality reduction aims at providing low-dimensional representations of high-dimensional data sets. Many new methods have recently been proposed, but the question of their assessment and comparison remains open. This paper reviews some of the existing quality measures that are based on distance ranking and K-ary neighborhoods. In this context, comparing the ranks in the high- and low-dimensional spaces leads to the definition of the co-ranking matrix. Rank errors and concepts such as neighborhood intrusions and extrusions can be associated with different blocks of the co-ranking matrix. The considered quality criteria are then cast within this unifying framework, and the blocks they involve are identified. The same framework allows us to propose simpler criteria, which quantify two aspects of the embedding, namely its overall quality and its tendency to favor either intrusions or extrusions. Finally, a simple experiment illustrates the soundness of the approach.
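
The abstract does not spell out the formulas, so the following Python sketch is illustrative only. It assumes the usual definition of the co-ranking matrix, in which entry (k, l) counts the pairs (i, j) where point j is the k-th nearest neighbor of point i in the high-dimensional space and the l-th nearest neighbor in the low-dimensional space. The two summary criteria are assumed to be the fraction of mass in the upper-left K-by-K block (overall quality) and the difference between the below-diagonal and above-diagonal mass of that block (intrusions versus extrusions). The function names `rank_matrix`, `coranking_matrix`, `q_nx`, and `b_nx` are hypothetical, and Euclidean distance is an assumption.

```python
import numpy as np


def rank_matrix(X):
    """Rank of every point j in the distance-sorted neighbor list of point i.

    Rank 0 is reserved for the point itself; true neighbors get ranks 1..N-1.
    """
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))      # pairwise Euclidean distances
    np.fill_diagonal(d, -1.0)                  # force each point to rank 0 w.r.t. itself
    order = np.argsort(d, axis=1, kind="stable")
    n = X.shape[0]
    ranks = np.empty_like(order)
    ranks[np.arange(n)[:, None], order] = np.arange(n)[None, :]
    return ranks


def coranking_matrix(X_high, X_low):
    """Q[k-1, l-1] = number of pairs with high-dim rank k and low-dim rank l."""
    rho = rank_matrix(X_high)                  # ranks in the high-dimensional space
    r = rank_matrix(X_low)                     # ranks in the low-dimensional space
    n = X_high.shape[0]
    off_diag = ~np.eye(n, dtype=bool)          # skip the trivial (i, i) pairs
    Q = np.zeros((n - 1, n - 1), dtype=np.int64)
    np.add.at(Q, (rho[off_diag] - 1, r[off_diag] - 1), 1)
    return Q


def q_nx(Q, K):
    """Assumed overall-quality criterion: mass of the upper-left K x K block."""
    n = Q.shape[0] + 1
    return Q[:K, :K].sum() / (K * n)


def b_nx(Q, K):
    """Assumed behavior criterion: intrusions minus extrusions inside the block.

    Positive values suggest an intrusive embedding, negative an extrusive one.
    """
    n = Q.shape[0] + 1
    block = Q[:K, :K]
    intrusions = np.tril(block, -1).sum()      # low-dim rank smaller than high-dim rank
    extrusions = np.triu(block, 1).sum()       # low-dim rank larger than high-dim rank
    return (intrusions - extrusions) / (K * n)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_high = rng.normal(size=(50, 5))
    X_low = X_high[:, :2]                      # crude "embedding": keep two coordinates
    Q = coranking_matrix(X_high, X_low)
    for K in (5, 10, 20):
        print(K, q_nx(Q, K), b_nx(Q, K))
```

Under these assumptions, an embedding that preserves all ranks yields a diagonal co-ranking matrix, so the quality criterion equals 1 and the behavior criterion equals 0 for every K.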
