Robust Graph-Theoretic Clustering Approaches Using Node-Based Resilience Measures

This paper examines a schema for graph-theoretic clustering using node-based resilience measures. Node-based resilience measures optimize an objective based on a critical set of nodes whose removal causes some severity of disconnection in the network. Beyond presenting a general framework for applying node-based resilience measures to variations of clustering problems, we emphasize the unique potential of such methods to achieve two goals: (i) clustering a graph in one step without knowing the number of clusters a priori, and (ii) removing noise from noisy data. We first present results of clustering experiments using a β-parametrized generalization of vertex attack tolerance, showing high clustering accuracy on both real datasets and equal-density synthetic datasets, as well as successful removal of noise nodes. We show that, in some cases, arbitrarily increasing β increases the number of noise nodes removed, and that internal validation measures can be used to determine the correct number of clusters for a class of datasets. Further results are presented using five different resilience measures with a general node-based resilience clustering technique. In a subset of cases, a resilience measure such as integrity clusters with high accuracy in one step, producing the correct clustering while also determining the correct number of clusters. Integrity also shows promise for noise removal, removing up to 80% of noise on some datasets.
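To make the idea of a critical node set concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes the commonly stated definitions of integrity, min over S of |S| + C_max(V − S), and of (unparametrized) vertex attack tolerance, min over non-empty S of |S| / (|V − S − C_max(V − S)| + 1), where C_max is the largest component left after removing S. It finds the optimal S by exhaustive search, which is feasible only for toy graphs, and returns the remaining components as clusters. The function names, the toy graph, and the tie-breaking rule (prefer the first, smallest critical set) are all hypothetical choices made for this example; the β-parametrized variant and the handling of removed nodes are not modeled.

```python
from itertools import combinations

def components(adj, removed):
    """Connected components of the graph after deleting the nodes in `removed`."""
    seen, comps = set(removed), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], {start}
        while stack:  # iterative depth-first search
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    comp.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps

def integrity_score(n, s, comps):
    """Assumed integrity objective: |S| + size of the largest remaining component."""
    largest = max((len(c) for c in comps), default=0)
    return s + largest

def vat_score(n, s, comps):
    """Assumed vertex attack tolerance objective: |S| / (|V - S - C_max| + 1)."""
    largest = max((len(c) for c in comps), default=0)
    return s / (n - s - largest + 1)

def resilience_clustering(adj, score):
    """Brute-force search over non-empty proper node subsets S (exponential time).

    Returns (best objective value, critical set S, components of G - S).
    Ties are broken in favor of the first subset found, smallest sizes first.
    Assumes a connected input graph.
    """
    n = len(adj)
    best = None
    for k in range(1, n):
        for subset in combinations(sorted(adj), k):
            comps = components(adj, subset)
            value = score(n, k, comps)
            if best is None or value < best[0]:
                best = (value, set(subset), comps)
    return best

# Hypothetical toy graph: two triangles joined through a bridge node 3.
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4],
    4: [3, 5, 6], 5: [4, 6], 6: [4, 5],
}

int_value, int_set, int_clusters = resilience_clustering(adj, integrity_score)
vat_value, vat_set, vat_clusters = resilience_clustering(adj, vat_score)
print("integrity:", int_value, "critical set:", int_set, "clusters:", int_clusters)
print("VAT:", vat_value, "critical set:", vat_set, "clusters:", vat_clusters)
```

On this toy graph both objectives select the bridge node as the critical set and leave the two triangles as clusters, so the number of clusters falls out of the optimization rather than being supplied in advance. Real datasets would need a heuristic or approximation in place of the exhaustive search.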
