Community detection in networks via nonlinear modularity eigenvectors

Revealing community structure in a network or dataset is a central problem arising in many scientific areas. The modularity function $Q$ is an established measure of the quality of a community, where a community is identified as a set of nodes having high modularity. In our terminology, a set of nodes with positive modularity is called a \textit{module}, and a set that maximizes $Q$ is called a \textit{leading module}. Finding a leading module in a network is an important task; however, the size of real-world problems makes the exact maximization of $Q$ infeasible. This calls for approximation techniques, which are typically based on a linear relaxation of $Q$ induced by the spectrum of the modularity matrix $M$. In this work we propose a nonlinear relaxation, based instead on the spectrum of a nonlinear modularity operator $\mathcal M$. We show that the extremal eigenvalues of $\mathcal M$ provide an exact relaxation of the modularity measure $Q$, at the price of being more challenging to compute than those of $M$. We therefore extend existing work on nonlinear Laplacians by proposing a computational scheme, named \textit{generalized RatioDCA}, for computing such extremal eigenvalues. We show monotonic ascent and convergence of the method. Finally, we apply the new method to several synthetic and real-world data sets, demonstrating both the effectiveness of the model and the performance of the method.
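To make the linear relaxation mentioned above concrete, the following is a minimal sketch and not the method proposed in this work. It assumes the standard Newman-Girvan modularity matrix $M = A - d d^T / \mathrm{vol}(G)$ and the unnormalized set modularity $Q(S) = \mathbf{1}_S^T M \mathbf{1}_S$; the abstract does not state these definitions, and the function names and the toy graph are hypothetical. A candidate module is obtained by thresholding the leading eigenvector of $M$ at zero.

```python
import numpy as np

def modularity_matrix(A):
    """Assumed definition: M = A - d d^T / vol(G), with d the degree vector."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    return A - np.outer(d, d) / d.sum()

def set_modularity(A, S):
    """Assumed convention: Q(S) = 1_S^T M 1_S (unnormalized) for a node set S."""
    M = modularity_matrix(A)
    indicator = np.zeros(M.shape[0])
    indicator[list(S)] = 1.0
    return float(indicator @ M @ indicator)

def leading_eigenvector_split(A):
    """Linear (spectral) relaxation: keep nodes with a positive entry
    in the eigenvector of M belonging to its largest eigenvalue."""
    M = modularity_matrix(A)
    _, vectors = np.linalg.eigh(M)  # eigenvalues in ascending order
    v = vectors[:, -1]              # eigenvector of the largest eigenvalue
    return {i for i in range(len(v)) if v[i] > 0}

# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

S = leading_eigenvector_split(A)
print(sorted(S), set_modularity(A, S))
```

Because an eigenvector is defined only up to sign, the returned set may be either of the two triangles. Under the assumed convention, both have positive modularity and therefore qualify as modules in the terminology above.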
