Compressive graph clustering from random sketches

Graph clustering, where the goal is to partition the nodes of a graph into disjoint clusters, arises in applications such as community detection, network monitoring, and bioinformatics. This paper describes an approach to graph clustering based on a small number of linear measurements, i.e., sketches, of the adjacency matrix, where each sketch corresponds to the number of edges in a randomly selected subgraph. Under the stochastic block model, we propose a computationally tractable algorithm based on semidefinite programming that recovers the underlying clustering structure, motivated by the low-dimensional, parsimonious structure of the clustering matrix. Numerical examples validate the performance of the proposed algorithm, which recovers the clustering matrix exactly under favorable trade-offs between the number of sketches and the edge density gap of the stochastic block model.
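The measurement model described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the parameters `p`, `q`, and `keep_prob`, and the choice of selecting each node independently with probability 1/2 are assumptions made for illustration only. For a symmetric adjacency matrix A with zero diagonal and a 0/1 node-indicator vector a, the number of edges in the induced subgraph is (aᵀAa)/2, which is linear in the entries of A.

```python
import numpy as np

def sbm_adjacency(labels, p, q, rng):
    """Sample a symmetric 0/1 adjacency matrix from a stochastic block model.

    Nodes with the same label are joined with probability p, others with q
    (hypothetical parameter names, chosen for this illustration).
    """
    n = len(labels)
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p, q)
    # Draw the strict upper triangle only, then mirror it: no self-loops.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(int)

def edge_count_sketches(A, m, rng, keep_prob=0.5):
    """Return m random node masks and the edge count of each induced subgraph."""
    n = A.shape[0]
    # Assumption: each node joins a subgraph independently with keep_prob.
    masks = (rng.random((m, n)) < keep_prob).astype(int)
    # a^T A a counts every edge twice because A is symmetric with zero diagonal.
    y = np.einsum("ij,jk,ik->i", masks, A, masks) // 2
    return masks, y

rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 10)   # 3 clusters of 10 nodes each
A = sbm_adjacency(labels, p=0.8, q=0.1, rng=rng)
masks, y = edge_count_sketches(A, m=50, rng=rng)
```

Here 50 sketches summarize a graph with 435 possible edges. Recovering the clustering matrix from `masks` and `y` is the job of the semidefinite program proposed in the paper, which this sketch does not attempt to reproduce.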
