Node-Based Resilience Measure Clustering with Applications to Noisy and Overlapping Communities in Complex Networks

This paper examines a schema for graph-theoretic clustering using node-based resilience measures. A node-based resilience measure optimizes an objective over a critical set of nodes whose removal disconnects the network to some degree. Beyond presenting a general framework for using node-based resilience measures in variations of the clustering problem, we experimentally validate the usefulness of such methods for the following: (i) clustering a graph in one step without knowing the number of clusters a priori; (ii) removing noise from noisy data; and (iii) detecting overlapping communities. We demonstrate that this clustering schema can be applied successfully to a wide range of data, including both real and synthetic networks, whether natively in graph form or expressed as point sets.
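As a rough illustration of the idea of a critical node set, the sketch below uses vertex integrity, which minimizes |S| + (size of the largest component of G − S) over node sets S. Choosing integrity is an assumption made here for illustration, since the abstract does not name a specific measure. The function names and the toy graph are hypothetical, and the exhaustive search is exponential, so it is only suitable for very small graphs. The components left after removing the critical set are read off as clusters, and their number is not supplied in advance.

```python
from itertools import combinations

def components(adj, removed):
    """Connected components of the graph after deleting the nodes in `removed`."""
    seen, comps = set(removed), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps

def integrity_clustering(adj):
    """Exhaustively minimize |S| + (largest component size of G - S).

    Returns (score, critical_set, clusters). Illustrative only: this is a
    brute-force search, not a method claimed by the paper.
    """
    nodes = list(adj)
    best = None
    for k in range(len(nodes)):  # always leave at least one node in the graph
        for S in combinations(nodes, k):
            comps = components(adj, S)
            score = k + max(len(c) for c in comps)
            if best is None or score < best[0]:
                best = (score, set(S), comps)
    return best

# Hypothetical toy graph: two triangles joined through node 3.
graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4],
    4: [3, 5, 6], 5: [4, 6], 6: [4, 5],
}
score, critical, clusters = integrity_clustering(graph)
print(score, critical, clusters)
```

On this toy graph the search selects the bridging node as the critical set, and the two triangles remain as clusters. How critical nodes are later assigned to clusters, treated as noise, or shared between overlapping communities is not covered by this sketch.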
