Semidefinite Programming for Community Detection With Side Information

This paper develops an efficient semidefinite programming (SDP) solution for community detection that incorporates non-graph data, known in this context as side information. SDP is already an efficient solution for standard community detection on graphs. We formulate a semidefinite relaxation for the maximum likelihood estimation of node labels, subject to observing both graph and non-graph data. This formulation is distinct from the SDP solution of standard community detection, but maintains its desirable properties. We calculate the exact recovery threshold for three types of side information: partially revealed labels, noisy labels, and multiple observations (features) per node with arbitrary but finite cardinality. We find that SDP has the same exact recovery threshold in the presence of side information as maximum likelihood with side information. Thus, the methods developed herein are computationally efficient as well as asymptotically accurate for community detection in the presence of side information. Simulations show that the asymptotic results of this paper can also shed light on the performance of SDP for graphs of modest size.
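
For orientation, the standard SDP relaxation that the abstract takes as its starting point can be sketched as follows. This is the familiar baseline without side information, not the formulation proposed in this paper. It assumes a binary symmetric stochastic block model with two equal-sized communities and denser within-community connectivity. The symbols $A$ (adjacency matrix), $x$ (label vector), $Y$ (lifted matrix variable), and $J$ (all-ones matrix) are notation chosen here for illustration and do not appear in the abstract.

```latex
% Maximum likelihood estimation of labels x in {-1,+1}^n under the
% assumptions above (two balanced communities, assortative graph):
\begin{align*}
  \hat{x}_{\mathrm{ML}} \;=\; \arg\max_{x}\;& x^{\top} A\, x \\
  \text{s.t. } & x_i \in \{\pm 1\},\; i = 1,\dots,n, \qquad \mathbf{1}^{\top} x = 0 .
\end{align*}
% Lifting Y = x x^T and dropping the rank-one constraint gives the SDP:
\begin{align*}
  \hat{Y}_{\mathrm{SDP}} \;=\; \arg\max_{Y}\;& \langle A, Y \rangle \\
  \text{s.t. } & Y \succeq 0, \qquad Y_{ii} = 1,\; i = 1,\dots,n, \qquad \langle J, Y \rangle = 0 .
\end{align*}
```

The relaxation recovers the communities exactly when its optimum is the rank-one matrix built from the true labels. How side information enters the objective or the constraints is specific to the paper and is not reconstructed here.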
