Assessing the quality of multilevel graph clustering

Abstract: “Lifting up” a non-hierarchical approach to handle hierarchical clustering, by iteratively applying it to cluster a graph level by level, is a popular strategy. However, these lifted iterative strategies cannot reasonably guide the overall nesting process, precisely because they fail to evaluate the hierarchical character of the clustering they produce. In this study, we develop a criterion that evaluates the quality of the subgraph hierarchy. The multilevel criterion we present and discuss generalizes a measure designed for one-level (flat) graph clustering so that it takes the nesting of clusters into account. We borrow ideas from standard techniques in algebraic combinatorics and use a variable $q$ to keep track of the depth of the clusters in which edges occur. Our multilevel measure relies on a recursive definition involving $q$ and outputs a one-variable polynomial. This paper examines archetypal examples as proofs of concept; these simple cases help in understanding how the multilevel measure actually works. We also apply this multilevel modularity to real-world networks to demonstrate how it can be used to compare hierarchical clusterings of graphs.
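
The bookkeeping role of $q$ can be illustrated with a minimal sketch. This is not the multilevel measure defined in the paper, whose recursive definition is not reproduced here; it only shows the idea of a polynomial whose coefficient of $q^d$ records what happens at depth $d$. The nested-list encoding of the hierarchy and the function names `cluster_paths` and `edge_depth_polynomial` are assumptions made for this illustration.

```python
from collections import Counter


def cluster_paths(tree, prefix=()):
    """Map each vertex to the tuple of nested cluster indices that contain it.

    `tree` is a nested list (an assumed encoding): inner lists are clusters,
    any other element is a vertex label. The root list is the whole graph.
    """
    paths = {}
    for i, child in enumerate(tree):
        if isinstance(child, list):      # a nested cluster: recurse one level down
            paths.update(cluster_paths(child, prefix + (i,)))
        else:                            # a vertex sitting directly in this cluster
            paths[child] = prefix
    return paths


def edge_depth_polynomial(tree, edges):
    """Return {d: count}, read as the polynomial sum(count * q**d).

    The coefficient of q**d is the number of edges whose endpoints share
    exactly d nested clusters below the root (d = 0: only the root).
    """
    paths = cluster_paths(tree)
    coeffs = Counter()
    for u, v in edges:
        depth = 0
        for a, b in zip(paths[u], paths[v]):
            if a != b:                   # the two endpoints part ways here
                break
            depth += 1
        coeffs[depth] += 1
    return dict(coeffs)


# Hypothetical toy hierarchy: {{a, b}, {c}} and {d, e} inside the root.
tree = [[["a", "b"], ["c"]], ["d", "e"]]
edges = [("a", "b"), ("a", "c"), ("c", "d"), ("d", "e")]
print(edge_depth_polynomial(tree, edges))  # {2: 1, 1: 2, 0: 1}, i.e. 1 + 2q + q^2
```

In this toy case the edge (a, b) lies two levels deep, (a, c) and (d, e) lie one level deep, and (c, d) crosses the top-level clusters. A quality measure in the spirit of the abstract would replace the raw edge counts with modularity-style contributions at each depth.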
