A Vertex-centric Markov Chain Algorithm for Network Clustering based on b-Coloring

The massive size and complexity of big datasets, such as those coming from social, natural and sensor environments, pose major challenges to unsupervised cluster analysis: algorithms must be designed for performance and scalability, including in parallel and distributed networking contexts. To address these obstacles, parallelizing clustering techniques, possibly exploiting GPU-centered computation, can help close the gap in application areas such as network routing optimization and the management of large-scale IoT networks. This enables extraction, processing and policy making based on rich network information that is typically represented as graphs. One established approach to graph clustering relies on coloring techniques; indeed, graph clustering and graph coloring are closely related. We devise a graph clustering technique based on a Markov chain method for b-coloring the data points; it works in an efficient vertex-centric parallel manner and produces a valid clustering with a reduced number of color classes. We evaluate our algorithm on synthetic data exhibiting group structure and present a brief convergence analysis of the method.
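To make the notion of b-coloring concrete, the following minimal sketch checks the standard definition: a proper coloring in which every color class contains at least one vertex (a b-vertex) with neighbors in all other color classes. It is only a validity checker, not the Markov chain algorithm described above; the graph representation and the function names (`is_proper`, `b_vertices`, `is_b_coloring`) are assumptions made for illustration.

```python
from typing import Dict, Hashable, Set

# Assumed representation: undirected graph as an adjacency dictionary.
Graph = Dict[Hashable, Set[Hashable]]
Coloring = Dict[Hashable, int]


def is_proper(graph: Graph, coloring: Coloring) -> bool:
    """True if no edge joins two vertices of the same color."""
    return all(coloring[u] != coloring[v] for u in graph for v in graph[u])


def b_vertices(graph: Graph, coloring: Coloring) -> Dict[int, Set[Hashable]]:
    """Map each color to the vertices of that color that see every other color."""
    colors = set(coloring.values())
    result: Dict[int, Set[Hashable]] = {}
    for v, neighbors in graph.items():
        seen = {coloring[u] for u in neighbors}
        if seen >= colors - {coloring[v]}:
            result.setdefault(coloring[v], set()).add(v)
    return result


def is_b_coloring(graph: Graph, coloring: Coloring) -> bool:
    """True if the coloring is proper and every color class has a b-vertex."""
    colors = set(coloring.values())
    return is_proper(graph, coloring) and set(b_vertices(graph, coloring)) == colors


# Path on four vertices: a - b - c - d
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
two_colors = {"a": 0, "b": 1, "c": 0, "d": 1}
three_colors = {"a": 0, "b": 1, "c": 2, "d": 0}
```

Here `two_colors` is a b-coloring of the path, whereas `three_colors` is proper but not a b-coloring: neither vertex of color 0 is adjacent to both other colors. In the clustering view, each color class is a cluster, and a b-vertex plays the role of a representative that is linked to every other cluster.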
