COMMUNITY STRUCTURE DISCOVERY IN DIRECTED GRAPHS BY ASYMMETRIC CLUSTERING

A familiar problem in the analysis of network data, in which relations between objects are described by links between the vertices of a graph, is the discovery of so-called community structures, i.e., the detection of subgraphs of closely connected vertices with comparatively few links joining vertices of different subgraphs. Modularity is a popular goodness-of-fit index for this task. Whereas undirected graphs restrict consideration to essentially symmetric relations, directed graphs can describe more realistic situations. In this paper we use shortest walk lengths between all pairs of vertices as dissimilarities, rather than only the adjacency information given by the directed edges of the graph, which allows us to suggest a new approach in which asymmetric clustering is a main step. Enriching the underlying adjacency matrix to a walk-length-based dissimilarity matrix and applying asymmetric hierarchical clustering are the keys to our proposed approach to community structure discovery in directed graphs. For the evaluation we use example graphs from the literature with known modularity values as well as computer-generated directed benchmark graphs. The findings show that our approach compares favourably with results available from the literature.
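To make the two ingredients named above concrete, the following is a minimal sketch, not the authors' implementation. It computes an asymmetric dissimilarity matrix of shortest directed walk lengths by breadth-first search from every vertex, and evaluates a given partition with the directed modularity commonly attributed to Leicht and Newman, Q = (1/m) Σ_ij [A_ij − k_i^out k_j^in / m] δ(c_i, c_j), with A_ij = 1 for an edge from i to j. The function names, the toy graph, and the use of infinity for unreachable pairs are assumptions made for illustration; the asymmetric hierarchical clustering step itself is not shown.

```python
from collections import deque


def shortest_walk_lengths(adj):
    """All-pairs shortest directed walk lengths (unweighted), via BFS.

    adj[i][j] == 1 means there is an edge from i to j.
    Unreachable pairs get float('inf') (an illustrative convention).
    """
    n = len(adj)
    inf = float("inf")
    dist = [[inf] * n for _ in range(n)]
    for source in range(n):
        dist[source][source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and dist[source][v] == inf:
                    dist[source][v] = dist[source][u] + 1
                    queue.append(v)
    return dist


def directed_modularity(adj, labels):
    """Directed modularity of the partition given by labels[i]."""
    n = len(adj)
    m = sum(sum(row) for row in adj)  # number of directed edges
    k_out = [sum(adj[i]) for i in range(n)]
    k_in = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += adj[i][j] - k_out[i] * k_in[j] / m
    return q / m


# Hypothetical toy graph: two directed 3-cycles joined by the edge 2 -> 3.
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for i, j in edges:
    adj[i][j] = 1

dist = shortest_walk_lengths(adj)
q = directed_modularity(adj, [0, 0, 0, 1, 1, 1])
```

In this toy graph the dissimilarity from vertex 0 to vertex 3 is finite while the reverse is not, which is the kind of asymmetry that a symmetric clustering method would have to discard and an asymmetric one can exploit.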
