Graph Clustering by Maximizing Statistical Association Measures

We are interested in objective functions for clustering undirected, unweighted graphs. Our goal is to define alternatives to the popular modularity measure. To this end, we propose to adapt statistical association coefficients, which traditionally measure the proximity between two partitions, to graph clustering. Our approach relies on a relational formulation of statistical association measures, which uses the adjacency matrices of the equivalence relations underlying the partitions. We show that graph clustering can then be solved by fitting an equivalence relation to the graph, that is, by maximizing a statistical association coefficient between the two. We underline the connections between the proposed framework and the modularity model. Our theoretical work is complemented by an empirical study on computer-generated graphs. Our results show that the proposed methods recover the community structure of a graph as well as, or better than, modularity maximization.
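To make the relational formulation concrete, the following minimal sketch encodes a partition as the 0/1 adjacency matrix of its equivalence relation and scores how well that relation fits a graph. The proposed association coefficients are not spelled out in this abstract, so the score used here is the standard Newman-Girvan modularity, written in relational form purely as an illustration. The function names `relation_matrix` and `modularity_relational` and the toy graph are assumptions made for this example.

```python
import numpy as np

def relation_matrix(labels):
    """Adjacency matrix X of the equivalence relation of a partition:
    X[i, j] = 1 if nodes i and j share a cluster, else 0."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def modularity_relational(A, labels):
    """Standard modularity written relationally:
    Q = (1 / 2m) * sum_ij (A_ij - k_i * k_j / 2m) * X_ij."""
    A = np.asarray(A, dtype=float)
    X = relation_matrix(labels)
    k = A.sum(axis=1)                  # node degrees
    two_m = k.sum()                    # twice the number of edges
    expected = np.outer(k, k) / two_m  # null-model term
    return float(((A - expected) * X).sum() / two_m)

# Toy graph (illustrative only): two triangles joined by a single edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

q_triangles = modularity_relational(A, [0, 0, 0, 1, 1, 1])
q_mixed = modularity_relational(A, [0, 1, 0, 1, 0, 1])
q_single = modularity_relational(A, [0, 0, 0, 0, 0, 0])
```

Under these assumptions, the partition that follows the two triangles scores higher than the mixed one, and placing all nodes in a single cluster scores zero. Swapping the modularity expression for another function of `A` and `X` is the kind of substitution the framework describes.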
