Distance-Preserving Projection of High-Dimensional Data for Nonlinear Dimensionality Reduction

A distance-preserving method is presented that maps high-dimensional data sequentially to a low-dimensional space. It preserves the exact distance from each data point to its nearest neighbor and to some of its other near neighbors. The intrinsic dimensionality of the data is estimated by examining how well interpoint distances are preserved. The method has no user-selectable parameters. It can successfully project data whose points are spread among multiple clusters. Experimental results show its usefulness in projecting high-dimensional data.
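
To make the idea of sequential, exact-distance placement concrete, the following is a minimal sketch and not the paper's algorithm, whose details the abstract does not give. It assumes a 2-D target space, a Prim-style insertion order (each new point is the unplaced point closest to any already-placed point), and circle intersection to honor a second near neighbor where that is geometrically possible. The names `sequential_project` and `place_point` are hypothetical.

```python
import numpy as np

def place_point(ya, d_a, yb, d_b):
    """Return a 2-D point at exact distance d_a from ya whose distance
    to yb is as close as possible to d_b (illustrative helper)."""
    L = np.linalg.norm(yb - ya)
    if L == 0.0:
        # Degenerate case: both references coincide; any direction works.
        return ya + np.array([d_a, 0.0])
    u = (yb - ya) / L                      # unit vector from ya toward yb
    perp = np.array([-u[1], u[0]])         # unit vector perpendicular to u
    # Offset along u of the circle-circle intersection, clamped so the
    # result stays on the circle of radius d_a around ya.
    x = np.clip((d_a**2 - d_b**2 + L**2) / (2.0 * L), -d_a, d_a)
    h = np.sqrt(max(d_a**2 - x**2, 0.0))   # offset along perp
    return ya + x * u + h * perp

def sequential_project(X):
    """Map the rows of X to 2-D one at a time (assumed Prim-style order).
    Returns the 2-D coordinates and the original distance matrix."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    Y = np.zeros((n, 2))
    placed = [0]                           # point 0 sits at the origin
    remaining = set(range(1, n))
    while remaining:
        rem = sorted(remaining)
        sub = D[np.ix_(rem, placed)]
        r, c = np.unravel_index(np.argmin(sub), sub.shape)
        i, a = rem[r], placed[c]           # closest (unplaced, placed) pair
        if len(placed) == 1:
            Y[i] = Y[a] + np.array([D[i, a], 0.0])
        else:
            # Second reference: the next-nearest already-placed point.
            others = [p for p in placed if p != a]
            b = others[int(np.argmin(D[i, others]))]
            Y[i] = place_point(Y[a], D[i, a], Y[b], D[i, b])
        placed.append(i)
        remaining.remove(i)
    return Y, D

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 6))               # synthetic data, for illustration
Y, D = sequential_project(X)
```

Under these assumptions every insertion edge is an edge of the minimum spanning tree, and each point's nearest-neighbor edge belongs to that tree when distances are distinct, so nearest-neighbor distances are reproduced up to floating-point error. The distance to the second reference point is reproduced only when the two circles intersect; otherwise the sketch falls back to the closest feasible position.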
