Hierarchical Quasi-Clustering Methods for Asymmetric Networks

This paper introduces hierarchical quasi-clustering methods, a generalization of hierarchical clustering to asymmetric networks in which the output structure preserves the asymmetry of the input data. We show that this output structure is equivalent to a finite quasi-ultrametric space and study admissibility with respect to two desirable properties. We prove that a modified version of single linkage is the only admissible quasi-clustering method. Moreover, we establish the stability of the proposed method and the invariance properties it satisfies. We further develop algorithms and illustrate the value of quasi-clustering analysis with a study of internal migration within the United States.
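As a rough illustration of what a quasi-ultrametric output can look like, the sketch below computes directed minimax chain costs on a small asymmetric dissimilarity matrix. It rests on an assumption the abstract does not spell out: that the modified single linkage assigns to each ordered pair (x, x') the smallest achievable "largest step" over directed chains from x to x'. The function name `directed_minimax_costs` and the example matrix are hypothetical, and the cubic-time Floyd-Warshall-style loop is chosen for clarity, not as the paper's algorithm.

```python
def directed_minimax_costs(A):
    """Return u where u[i][j] is the minimum, over directed chains from i to j,
    of the largest dissimilarity along the chain (assumed construction).

    A is a square list of lists of nonnegative dissimilarities; A[i][j] need
    not equal A[j][i]. Diagonal entries are treated as zero.
    """
    n = len(A)
    # Start from the direct dissimilarities, with zero self-cost.
    u = [[0.0 if i == j else float(A[i][j]) for j in range(n)] for i in range(n)]
    # Floyd-Warshall over the (min, max) semiring: allow node k as an
    # intermediate stop if it lowers the bottleneck of the chain i -> j.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(u[i][k], u[k][j])
                if via_k < u[i][j]:
                    u[i][j] = via_k
    return u


# Hypothetical 3-node asymmetric network.
A = [
    [0, 1, 4],
    [3, 0, 2],
    [1, 5, 0],
]
u = directed_minimax_costs(A)
for row in u:
    print(row)
```

In this example the output stays asymmetric (the cost from node 0 to node 1 differs from the cost from node 1 to node 0) while satisfying the strong triangle inequality u[i][j] <= max(u[i][k], u[k][j]), which is the defining property of a quasi-ultrametric.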
