Submatrix localization via message passing

The principal submatrix localization problem deals with recovering a $K\times K$ principal submatrix of elevated mean $\mu$ in a large $n\times n$ symmetric matrix subject to additive standard Gaussian noise. This problem serves as a prototypical example for community detection, in which the community corresponds to the support of the submatrix. The main result of this paper is that in the regime $\Omega(\sqrt{n}) \leq K \leq o(n)$, the support of the submatrix can be weakly recovered (with $o(K)$ misclassification errors on average) by an optimized message passing algorithm if $\lambda = \mu^2K^2/n$, the signal-to-noise ratio, exceeds $1/e$. This extends a result by Deshpande and Montanari previously obtained for $K=\Theta(\sqrt{n}).$ In addition, the algorithm can be extended to provide exact recovery whenever information-theoretically possible and achieve the information limit of exact recovery as long as $K \geq \frac{n}{\log n} (\frac{1}{8e} + o(1))$. The total running time of the algorithm is $O(n^2\log n)$. Another version of the submatrix localization problem, known as noisy biclustering, aims to recover a $K_1\times K_2$ submatrix of elevated mean $\mu$ in a large $n_1\times n_2$ Gaussian matrix. The optimized message passing algorithm and its analysis are adapted to the bicluster problem assuming $\Omega(\sqrt{n_i}) \leq K_i \leq o(n_i)$ and $K_1\asymp K_2.$ A sharp information-theoretic condition for the weak recovery of both clusters is also identified.

[1]  Kai Lai Chung,et al.  A Course in Probability Theory , 1949 .

[2]  L. Goddard Information Theory , 1962, Nature.

[3]  W. Hoeffding Probability Inequalities for sums of Bounded Random Variables , 1963 .

[4]  Chandler Davis The rotation of eigenvectors by a perturbation , 1963 .

[5]  P. Wedin Perturbation bounds in connection with singular value decomposition , 1972 .

[6]  J. Hartigan Direct Clustering of a Data Matrix , 1972 .

[7]  V. V. Petrov Limit Theorems of Probability Theory: Sequences of Independent Random Variables , 1995 .

[8]  Noga Alon,et al.  Finding a large hidden clique in a random graph , 1998, SODA '98.

[9]  S. Szarek,et al.  Chapter 8 - Local Operator Theory, Random Matrices and Banach Spaces , 2001 .

[10]  Richard M. Karp,et al.  Algorithms for graph partitioning on the planted partition model , 2001, Random Struct. Algorithms.

[11]  Arieh Iserles,et al.  On the Foundations of Computational Mathematics , 2001 .

[12]  Rene F. Swarttouw,et al.  Orthogonal polynomials , 2020, NIST Handbook of Mathematical Functions.

[13]  D. Féral,et al.  The Largest Eigenvalue of Rank One Deformation of Large Wigner Matrices , 2006, math/0605624.

[14]  A. Nobel,et al.  Finding large average submatrices in high dimensional data , 2009, 0905.1682.

[15]  Andrea Montanari,et al.  The dynamics of message passing on dense graphs, with applications to compressed sensing , 2010, 2010 IEEE International Symposium on Information Theory.

[16]  Jun Yin,et al.  The Isotropic Semicircle Law and Deformation of Wigner Matrices , 2011, 1110.6449.

[17]  Herbert A. David,et al.  Order Statistics , 2011, International Encyclopedia of Statistical Science.

[18]  Sivaraman Balakrishnan,et al.  Minimax Localization of Structural Information in Large Noisy Matrices , 2011, NIPS.

[19]  Roman Vershynin,et al.  Introduction to the non-asymptotic analysis of random matrices , 2010, Compressed Sensing.

[20]  Adel Javanmard,et al.  State Evolution for General Approximate Message Passing Algorithms, with Applications to Spatial Coupling , 2012, ArXiv.

[21]  Yu. I. Ingster,et al.  Detection of a sparse submatrix of a high-dimensional noisy matrix , 2011, 1109.0898.

[22]  Yihong Wu,et al.  Computational Barriers in Minimax Submatrix Detection , 2013, ArXiv.

[23]  Yu. I. Ingster,et al.  Sharp Variable Selection of a Sparse Submatrix in a High-Dimensional Noisy Matrix , 2013, 1303.5647.

[24]  Peter J. Bickel,et al.  Pseudo-likelihood methods for community detection in large sparse networks , 2012, 1207.2340.

[25]  Yair Weiss,et al.  Belief Propagation , 2012, Encyclopedia of Social Network Analysis and Mining.

[26]  Elchanan Mossel,et al.  Belief propagation, robust reconstruction and optimal recovery of block models , 2013, COLT.

[27]  Emmanuel Abbe,et al.  Community detection in general stochastic block models: fundamental limits and efficient recovery algorithms , 2015, ArXiv.

[28]  T. Cai,et al.  Optimal estimation and rank detection for sparse spiked covariance matrices , 2013, Probability theory and related fields.

[29]  Tengyuan Liang,et al.  Computational and Statistical Boundaries for Submatrix Localization in a Large Noisy Matrix , 2015, 1502.01988.

[30]  Elchanan Mossel,et al.  Consistency Thresholds for the Planted Bisection Model , 2014, STOC.

[31]  Bruce E. Hajek,et al.  Semidefinite Programs for Exact Recovery of a Hidden Community , 2016, COLT.

[32]  Yudong Chen,et al.  Statistical-Computational Tradeoffs in Planted Problems and Submatrix Localization with a Growing Number of Clusters and Submatrices , 2014, J. Mach. Learn. Res..

[33]  Emmanuel Abbe,et al.  Exact Recovery in the Stochastic Block Model , 2014, IEEE Transactions on Information Theory.

[34]  Bruce Hajek,et al.  Information limits for recovering a hidden community , 2015, 2016 IEEE International Symposium on Information Theory (ISIT).

[35]  Andrea Montanari,et al.  On the Limitation of Spectral Methods: From the Gaussian Hidden Clique Problem to Rank One Perturbations of Gaussian Tensors , 2014, IEEE Transactions on Information Theory.