Topological Decomposition and Heuristics for High Speed Clustering of Complex Networks

With the exponential growth in the size of data and networks, fast new techniques for analyzing and exploring these networks are becoming a necessity. Moreover, the emergence of scale-free and small-world properties in real-world networks has stimulated considerable activity in network analysis and data mining. Clustering remains a fundamental technique for exploring and organizing these networks. The challenge is to find a clustering algorithm that both produces high-quality clusters and has low time complexity. In this paper, we propose a fast clustering algorithm that combines a Topological Decomposition with a set of heuristics to obtain a clustering. The algorithm, which we call Topological Decomposition and Heuristics for Clustering (TDHC), is highly efficient in terms of asymptotic time complexity compared with existing algorithms in the literature. We also introduce a number of heuristics that complement the clustering algorithm, increasing the speed of the clustering process while maintaining high clustering quality. We demonstrate the effectiveness of the proposed method on several real-world data sets and compare its results with those of well-known clustering algorithms.
