Spectral density-based clustering algorithms for complex networks

Introduction: Clustering is usually the first exploratory step in the analysis of empirical data. When the data set comprises graphs, the most common approaches cluster the vertices of a graph. In this work, we instead group whole networks with similar connectivity structures. This approach could be applied to functional brain networks (FBNs) to identify subgroups of people with similar functional connectivity, for example when studying a mental disorder. The main difficulty is that real-world networks present natural fluctuations, which a clustering method should take into account.

Methods: In this context, spectral density is a useful feature because graphs generated by different models, and therefore having different connectivity structures, present distinct spectral densities. We introduce two clustering methods: k-means for graphs of the same size and gCEM, a model-based approach for graphs of different sizes. We evaluated their performance on toy models. Finally, we applied them to FBNs of monkeys under anesthesia and to a dataset of chemical compounds.

Results: We show that our methods work well on both toy models and real-world data. They perform well at clustering graphs with different connectivity structures even when the graphs have the same number of edges, the same number of vertices, and the same degree centrality.

Discussion: We recommend the k-means-based method when the graphs have the same number of vertices and the gCEM method when the graphs have different numbers of vertices.
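The idea of clustering same-size graphs by their spectral densities can be illustrated with a minimal sketch. This is not the authors' implementation. The Gaussian kernel, the fixed bandwidth, the evaluation grid, the two toy graph models (Erdős–Rényi and random geometric), and every function name below are assumptions chosen for illustration. The gCEM method is not covered.

```python
import numpy as np

rng = np.random.default_rng(0)


def erdos_renyi(n, p, rng):
    """Adjacency matrix of an undirected Erdos-Renyi graph G(n, p)."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)


def random_geometric(n, radius, rng):
    """Adjacency matrix of a random geometric graph on the unit square."""
    pts = rng.random((n, 2))
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    adj = (dist < radius).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj


def spectral_densities(adjs, n_grid=200, bandwidth=0.5):
    """Gaussian-kernel estimate of each graph's adjacency eigenvalue density.

    All densities are evaluated on one shared grid so that they can be
    compared as ordinary vectors. The bandwidth is an arbitrary assumption.
    """
    eigs = [np.linalg.eigvalsh(a) for a in adjs]
    lo = min(e.min() for e in eigs) - 3 * bandwidth
    hi = max(e.max() for e in eigs) + 3 * bandwidth
    grid = np.linspace(lo, hi, n_grid)
    step = grid[1] - grid[0]
    rows = []
    for e in eigs:
        z = (grid[:, None] - e[None, :]) / bandwidth
        dens = np.exp(-0.5 * z**2).sum(axis=1)
        rows.append(dens / (dens.sum() * step))  # unit area on the grid
    return grid, np.array(rows)


def kmeans(points, k, rng, n_iter=100):
    """Plain Lloyd's k-means with squared Euclidean distance."""
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Toy data: ten graphs from each model, all with the same number of vertices.
n = 60
graphs = [erdos_renyi(n, 0.2, rng) for _ in range(10)]
graphs += [random_geometric(n, 0.25, rng) for _ in range(10)]

grid, densities = spectral_densities(graphs)
labels, centers = kmeans(densities, k=2, rng=rng)
print(labels)
```

Comparing `labels` against the known generating model of each graph gives a quick sanity check. How well the two groups separate depends on the bandwidth, the graph size, and the k-means initialization.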
