Statistical Problems with Planted Structures: Information-Theoretical and Computational Limits

Over the past few years, insights from computer science, statistical physics, and information theory have revealed phase transitions in a wide array of high-dimensional statistical problems at two distinct thresholds: one is the information-theoretical (IT) threshold, below which the observation is too noisy for the ground truth structure to be inferred regardless of the computational cost; the other is the computational threshold, above which inference can be performed efficiently, i.e., in time that is polynomial in the input size. In the intermediate regime, inference is information-theoretically possible but conjectured to be computationally hard. This article surveys the common techniques for determining the sharp IT and computational limits, using community detection and submatrix detection as illustrative examples. For IT limits, we discuss tools including the first- and second-moment methods for analyzing the maximum likelihood estimator, information-theoretic methods for proving impossibility results using rate-distortion theory, and methods originating from statistical physics, such as the interpolation method. To investigate computational limits, we describe a common recipe for constructing a randomized polynomial-time reduction scheme that approximately maps instances of the planted clique problem to the problem of interest in total variation distance.
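To make the starting point of such reductions concrete, the following is a minimal sketch of how a planted clique instance can be sampled. It assumes the standard formulation (a random graph with independent edges, plus a clique on a uniformly chosen vertex subset); the function name `planted_clique` and the parameters `n`, `k`, `p`, and `seed` are hypothetical and chosen only for illustration.

```python
import random
from itertools import combinations


def planted_clique(n, k, p=0.5, seed=None):
    """Sample an n-vertex random graph and plant a clique on k random vertices.

    Returns (adj, clique), where adj is a list of neighbor sets and
    clique is the set of planted vertices. All names are illustrative.
    """
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]

    # Background noise: each pair is joined independently with probability p.
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)

    # Planted structure: join every pair inside a uniformly random k-subset.
    clique = set(rng.sample(range(n), k))
    for u, v in combinations(sorted(clique), 2):
        adj[u].add(v)
        adj[v].add(u)

    return adj, clique


adj, clique = planted_clique(n=200, k=20, seed=0)
```

The inference task is then to decide whether a clique was planted (detection) or to recover `clique` from `adj` alone (recovery); the smaller `k` is relative to `n`, the harder both become.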
