Building k-connected neighborhood graphs for isometric data embedding

Isometric data embedding based on geodesic distance requires a connected neighborhood graph so that the geodesic distance between every pair of data points can be estimated. This paper proposes an approach for constructing k-connected neighborhood graphs. The approach uses a greedy algorithm: candidate edges are considered in nondecreasing order of length, and an edge is added to the neighborhood graph if its end vertices are not yet k-connected in the graph. The k-connectedness between two vertices is tested using a network flow technique in which every vertex is assigned a unit flow capacity. The approach is applicable to a wide range of data. Experiments show that it gives better estimates of geodesic distances than other approaches, especially when the data are undersampled or nonuniformly distributed.
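
The greedy construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names `vertex_disjoint_paths` and `k_connected_neighborhood_graph` are hypothetical, all point pairs are treated as candidate edges, Euclidean distance is assumed as the edge length, and a plain breadth-first augmenting-path search stands in for whatever flow algorithm the paper uses. Each vertex is split into an "in" and an "out" node joined by a unit-capacity arc, so that the maximum flow between two vertices counts internally vertex-disjoint paths; the two end vertices themselves are given capacity k in this sketch so that they do not limit the flow.

```python
from collections import deque
from itertools import combinations
import math


def vertex_disjoint_paths(adj, s, t, k):
    """Count up to k internally vertex-disjoint s-t paths in an undirected graph.

    adj maps each vertex to the set of its neighbors. Node (w, 0) is the
    "in" copy of w and (w, 1) is the "out" copy (an assumed encoding).
    """
    cap = {}

    def add_arc(a, b, c):
        cap.setdefault(a, {})
        cap.setdefault(b, {})
        cap[a][b] = cap[a].get(b, 0) + c
        cap[b].setdefault(a, 0)  # residual (reverse) arc

    for w in adj:
        # Unit capacity through every vertex except the two end vertices.
        add_arc((w, 0), (w, 1), k if w in (s, t) else 1)
        for x in adj[w]:
            add_arc((w, 1), (x, 0), 1)

    source, sink = (s, 1), (t, 0)
    flow = 0
    while flow < k:  # stop early once k paths are found
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            a = queue.popleft()
            for b, c in cap[a].items():
                if c > 0 and b not in parent:
                    parent[b] = a
                    queue.append(b)
        if sink not in parent:
            break  # no augmenting path is left
        b = sink
        while parent[b] is not None:  # push one unit of flow along the path
            a = parent[b]
            cap[a][b] -= 1
            cap[b][a] += 1
            b = a
        flow += 1
    return flow


def k_connected_neighborhood_graph(points, k):
    """Greedily add edges, shortest first, between vertices not yet k-connected."""
    n = len(points)
    adj = {i: set() for i in range(n)}
    candidates = sorted(
        combinations(range(n), 2),
        key=lambda e: math.dist(points[e[0]], points[e[1]]),
    )
    for u, v in candidates:
        if vertex_disjoint_paths(adj, u, v, k) < k:
            adj[u].add(v)
            adj[v].add(u)
    return adj
```

Calling `k_connected_neighborhood_graph(points, 1)` on a list of coordinate tuples keeps only the edges that join previously disconnected vertices, which is the same selection rule as Kruskal's minimum spanning tree algorithm; larger values of k add further short edges until each pair of end vertices has k vertex-disjoint paths. The sketch runs one flow computation per candidate pair and is meant for small inputs only.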
