Graph powering and spectral robustness

Spectral algorithms, such as principal component analysis and spectral clustering, typically require careful data transformations to be effective: upon observing a matrix $A$, one may look at the spectrum of $\psi(A)$ for a properly chosen $\psi$. The issue is that the spectrum of $A$ might be contaminated by non-informational top eigenvalues, e.g., due to scale variations in the data, and the application of $\psi$ aims to remove these. Designing a good functional $\psi$ (and establishing what good means) is often challenging and model dependent. This paper proposes a simple and generic construction for sparse graphs, $$\psi(A) = \mathbb{1}((I+A)^r \ge 1),$$ where $A$ denotes the adjacency matrix, the indicator is applied entrywise, and $r$ is an integer (less than the graph diameter). This produces a graph connecting vertices from the original graph that are within distance $r$, and is referred to as graph powering. Graph powering is shown to regularize the graph and decontaminate its spectrum in the following sense: (i) if the graph is drawn from the sparse Erd\H{o}s-R\'enyi ensemble, which has no spectral gap, graph powering produces a `maximal' spectral gap, where maximality is justified by establishing an Alon-Boppana result for powered graphs; (ii) if the graph is drawn from the sparse SBM, graph powering achieves the fundamental limit for weak recovery (the KS threshold), similarly to \cite{massoulie-STOC}, settling an open problem therein. Further, graph powering is shown to be significantly more robust to tangles and cliques than previous spectral algorithms based on self-avoiding or nonbacktracking walk counts \cite{massoulie-STOC,Mossel_SBM2,bordenave,colin3}. This is illustrated on a geometric block model that is dense in cliques.
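
As an illustration of the construction $\psi(A)$ above, the following is a minimal sketch in Python with NumPy, not the authors' implementation. The function name `graph_power` and the path-graph example are assumptions introduced here. The sketch evaluates the indicator of $(I+A)^r \ge 1$ literally, so diagonal entries equal 1; whether self-loops are kept or removed is not specified in the abstract. Since the entries of $(I+A)^r$ are nonnegative integers, testing "at least 1" is the same as testing "nonzero", which lets the loop work with boolean reachability and avoids large walk counts.

```python
import numpy as np

def graph_power(A, r):
    """Return the 0/1 matrix 1((I + A)^r >= 1) for an adjacency matrix A.

    Hypothetical helper for illustration; entry (i, j) is 1 exactly when
    vertex j is within graph distance r of vertex i (including i itself).
    """
    A = np.asarray(A)
    n = A.shape[0]
    # One step: follow an edge (A) or stay in place (the I term).
    step = ((A != 0) | np.eye(n, dtype=bool)).astype(np.int64)
    # Start from distance 0: every vertex reaches only itself.
    reach = np.eye(n, dtype=bool)
    for _ in range(r):
        # Extend reachability by one step, then threshold back to booleans
        # so that walk counts never grow.
        reach = (reach.astype(np.int64) @ step) > 0
    return reach.astype(np.int64)

# Example (assumed for illustration): the path graph 0 - 1 - 2 - 3.
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
P2 = graph_power(A, 2)
print(P2)
```

In this example, vertices 0 and 3 are at distance 3, so they stay unconnected for $r = 2$ and become connected only for $r = 3$. The dense matrix product is chosen for brevity; for a large sparse graph, a breadth-first search of depth $r$ from each vertex would be the more natural choice.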

[1]  Kenneth Ward Church,et al.  Word Association Norms, Mutual Information, and Lexicography , 1989, ACL.

[2]  Yoshiyuki Kabashima,et al.  Limitations in the spectral method for graph partitioning: detectability threshold and localization of eigenvectors , 2015, Physical review. E, Statistical, nonlinear, and soft matter physics.

[3]  Ulrike von Luxburg,et al.  A tutorial on spectral clustering , 2007, Stat. Comput..

[4]  Nikhil Srivastava,et al.  Graph sparsification by effective resistances , 2008, SIAM J. Comput..

[5]  Chandler Davis The rotation of eigenvectors by a perturbation , 1963 .

[6]  Lap Chi Lau,et al.  Lower Bounds on Expansions of Graph Powers , 2014, APPROX-RANDOM.

[7]  Laurent Massoulié,et al.  Non-backtracking Spectrum of Random Graphs: Community Detection and Non-regular Ramanujan Graphs , 2014, 2015 IEEE 56th Annual Symposium on Foundations of Computer Science.

[8]  Laurent Massoulié,et al.  Community detection thresholds and the weak Ramanujan property , 2013, STOC.

[9]  Can M. Le,et al.  Sparse random graphs: regularization and concentration of the Laplacian , 2015, ArXiv.

[10]  Richard Peng,et al.  An efficient parallel solver for SDD linear systems , 2013, STOC.

[11]  Chang-Long Yao,et al.  Large deviations for the graph distance in supercritical continuum percolation , 2011 .

[12]  Elchanan Mossel,et al.  Spectral redemption in clustering sparse networks , 2013, Proceedings of the National Academy of Sciences.

[13]  Pramod Viswanath,et al.  All-but-the-Top: Simple and Effective Postprocessing for Word Representations , 2017, ICLR.

[14]  Patrick Pantel,et al.  From Frequency to Meaning: Vector Space Models of Semantics , 2010, J. Artif. Intell. Res..

[15]  Omer Reingold,et al.  Derandomization Beyond Connectivity: Undirected Laplacian Systems in Nearly Logarithmic Space , 2017, 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS).

[16]  Anup Rao,et al.  Stochastic Block Model and Community Detection in Sparse Graphs: A spectral algorithm with optimal rate of recovery , 2015, COLT.

[17]  Yu Cheng,et al.  Spectral Sparsification of Random-Walk Matrix Polynomials , 2015, ArXiv.

[18]  François Baccelli,et al.  Community detection on euclidean random graphs , 2017, 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[19]  M. Newman,et al.  Robustness of community structure in networks. , 2007, Physical review. E, Statistical, nonlinear, and soft matter physics.

[20]  Pravesh Kothari,et al.  Robust moment estimation and improved clustering via sum of squares , 2018, STOC.

[21]  Roman Vershynin,et al.  Community detection in sparse networks via Grothendieck’s inequality , 2014, Probability Theory and Related Fields.

[22]  J. Friedman Some geometric aspects of graphs and their eigenfunctions , 1993 .

[23]  Elchanan Mossel,et al.  A Proof of the Block Model Threshold Conjecture , 2013, Combinatorica.

[24]  Audrey Terras What are zeta functions of graphs and what are they good for? , 2005 .

[25]  Van Vu,et al.  A Simple SVD Algorithm for Finding Hidden Partitions , 2014, Combinatorics, Probability and Computing.

[26]  Andrea Montanari,et al.  Fundamental Limits of Weak Recovery with Applications to Phase Retrieval , 2017, COLT.

[27]  Bin Yu,et al.  Impact of regularization on spectral clustering , 2013, 2014 Information Theory and Applications Workshop (ITA).

[28]  Sivaraman Balakrishnan,et al.  Noise Thresholds for Spectral Clustering , 2011, NIPS.

[29]  Andrea Montanari,et al.  Semidefinite programs on sparse random graphs and their application to community detection , 2015, STOC.

[30]  Laurent Massoulié,et al.  Group Synchronization on Grids , 2017, Mathematical Statistics and Learning.

[31]  Emmanuel Abbe,et al.  Proof of the Achievability Conjectures for the General Stochastic Block Model , 2018 .

[32]  Adel Javanmard,et al.  Performance of a community detection algorithm based on semidefinite programming , 2016, ArXiv.

[33]  Amin Coja-Oghlan,et al.  Graph Partitioning via Adaptive Spectral Techniques , 2009, Combinatorics, Probability and Computing.

[34]  Arya Mazumdar,et al.  The Geometric Block Model , 2017, AAAI.

[35]  W. Kahan,et al.  The Rotation of Eigenvectors by a Perturbation. III , 1970 .

[36]  Emmanuel Abbe Community detection and stochastic block models , 2017 .

[37]  Noga Alon,et al.  On the second eigenvalue of a graph , 1991, Discret. Math..

[38]  Zhenguo Li,et al.  Noise Robust Spectral Clustering , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[39]  Daniel M. Kane,et al.  Robust Estimators in High Dimensions without the Computational Intractability , 2016, 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS).

[40]  Ankur Moitra,et al.  How robust are reconstruction thresholds for community detection? , 2015, STOC.

[41]  H. Weyl Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen (mit einer Anwendung auf die Theorie der Hohlraumstrahlung) , 1912 .

[42]  N. Linial,et al.  Expander Graphs and their Applications , 2006 .