Testing thresholds for high-dimensional sparse random geometric graphs

The random geometric graph model Geo_d(n, p) is a distribution over graphs in which the edges capture a latent geometry. To sample G ∼ Geo_d(n, p), we identify each of our n vertices with an independently and uniformly sampled vector from the d-dimensional unit sphere S^{d−1}, and we connect pairs of vertices whose vectors are “sufficiently close,” such that the marginal probability of an edge is p. Because of the underlying geometry, this model is natural for applications in data science and beyond. We investigate the problem of testing for this latent geometry, or in other words, distinguishing an Erdős-Rényi graph G(n, p) from a random geometric graph Geo_d(n, p). It is not too difficult to show that if d → ∞ while n is held fixed, the two distributions become indistinguishable; we wish to understand how fast d must grow as a function of n for indistinguishability to occur. When p = α/n for constant α, we prove that if d ≥ polylog(n), the total variation distance between the two distributions is close to 0; this improves upon the best previous bound of Brennan, Bresler, and Nagaraj (2020), which required d ≫ n^{3/2}, and further our result is nearly tight, resolving a conjecture of Bubeck, Ding, Eldan, & Rácz (2016) up to logarithmic factors. We also obtain improved upper bounds on the statistical indistinguishability thresholds in d for the full range of p satisfying 1/n ≤ p ≤ 1/2, improving upon the previous bounds by polynomial factors. Our analysis uses the Belief Propagation algorithm to characterize the distributions of (subsets of) the random vectors conditioned on producing a particular graph. In this sense, our analysis is connected to the “cavity method” from statistical physics. To analyze this process, we rely on novel sharp estimates for the area of the intersection of a random sphere cap with an arbitrary subset of S^{d−1}, which we prove using optimal transport maps and entropy-transport inequalities on the unit sphere. We believe these techniques may be of independent interest.

*UC Berkeley. sliu18@berkeley.edu. Supported in part by the Berkeley Haas Blockchain Initiative and a donation from the Ethereum Foundation.
†UC Berkeley. sidhanthm@cs.berkeley.edu. Supported by a Google PhD Fellowship.
‡Stanford University. tselil@stanford.edu.
§UC Berkeley. elizabeth_yang@berkeley.edu. Supported by the NSF GRFP under Grant No. DGE 1752814.
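
The sampling procedure described in the abstract can be illustrated with a minimal sketch. It assumes that "sufficiently close" means the inner product of two vectors is at least a threshold τ, and that τ is found numerically so that the marginal edge probability is p. The function names (`cap_probability`, `threshold`, `sample_geo`, `sample_gnp`) are hypothetical and chosen only for this illustration; the numerical integration is a convenience, not a method taken from the paper.

```python
import math
import random


def cap_probability(tau, d, steps=4000):
    """Pr[<u, v> >= tau] for independent uniform u, v on S^{d-1}, d >= 3.

    The inner product has density proportional to (1 - t^2)^((d - 3) / 2)
    on [-1, 1]; both integrals are approximated with the midpoint rule.
    """
    def integral(a, b):
        h = (b - a) / steps
        return h * sum(
            (1.0 - (a + (i + 0.5) * h) ** 2) ** ((d - 3) / 2.0)
            for i in range(steps)
        )
    return integral(tau, 1.0) / integral(-1.0, 1.0)


def threshold(p, d, iters=40):
    """Bisection for tau with cap_probability(tau, d) approximately p."""
    lo, hi = -1.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if cap_probability(mid, d) > p:
            lo = mid  # cap is too large, so raise the threshold
        else:
            hi = mid
    return (lo + hi) / 2.0


def sample_sphere(d, rng):
    """Uniform point on S^{d-1}: normalize a standard Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]


def sample_geo(n, d, p, rng):
    """Edge set of a sample from Geo_d(n, p) (assumed inner-product rule)."""
    tau = threshold(p, d)
    vecs = [sample_sphere(d, rng) for _ in range(n)]
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if sum(a * b for a, b in zip(vecs[i], vecs[j])) >= tau:
                edges.add((i, j))
    return edges


def sample_gnp(n, p, rng):
    """Edge set of an Erdős-Rényi sample G(n, p), for comparison."""
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p}


if __name__ == "__main__":
    rng = random.Random(0)
    n, d, p = 200, 5, 0.2
    pairs = n * (n - 1) / 2
    print("Geo_d edge density:", len(sample_geo(n, d, p, rng)) / pairs)
    print("G(n, p) edge density:", len(sample_gnp(n, p, rng)) / pairs)
```

Both samplers have the same marginal edge probability, so edge density alone does not separate the two models; any test for the latent geometry has to use correlations between edges.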

[1] J. Dall, et al. Random geometric graphs, 2002, Physical review. E, Statistical, nonlinear, and soft matter physics.

[2] Madhur Tulsiani, et al. A characterization of strong approximation resistance, 2013, Electron. Colloquium Comput. Complex.

[3] M. Ledoux, et al. Analysis and Geometry of Markov Diffusion Operators, 2013.

[4] D. Panchenko. The Sherrington-Kirkpatrick Model, 2013.

[5] Uriel Feige, et al. On the optimality of the random hyperplane rounding technique for MAX CUT, 2002, Random Struct. Algorithms.

[6] H. Nishimori. Internal Energy, Specific Heat and Correlation Function of the Bond-Random Ising Model, 1981.

[7] Guy Bresler, et al. De Finetti-Style Results for Wishart Matrices: Combinatorial Structure and Phase Transitions, 2021, ArXiv.

[8] Grant Schoenebeck, et al. Linear Level Lasserre Lower Bounds for Certain k-CSPs, 2008, 2008 49th Annual IEEE Symposium on Foundations of Computer Science.

[9] M. Mézard, et al. The Cavity Method at Zero Temperature, 2002, cond-mat/0207121.

[10] Danning Li, et al. Approximation of Rectangular Beta-Laguerre Ensembles and Large Deviations, 2013, 1309.3882.

[11] P. Massart, et al. Concentration inequalities and model selection, 2007.

[12] Florent Krzakala, et al. Information-theoretic thresholds from the cavity method, 2016, STOC.

[13] Aditya Bhaskara, et al. Detecting high log-densities: an O(n¼) approximation for densest k-subgraph, 2010, STOC '10.

[14] Dima Grigoriev, et al. Linear lower bound on degrees of Positivstellensatz calculus proofs for the parity, 2001, Theor. Comput. Sci.

[15] Daniel Cullina, et al. Improved Achievability and Converse Bounds for Erdos-Renyi Graph Matching, 2016, SIGMETRICS.

[16] Kathryn B. Laskey, et al. Stochastic blockmodels: First steps, 1983.

[17] Sébastien Bubeck, et al. Testing for high‐dimensional geometry in random graphs, 2014, Random Struct. Algorithms.

[18] Miklós Simonovits, et al. Deterministic and randomized polynomial-time approximation of radii, 2001.

[19] R. Handel. Probability in High Dimension, 2014.

[20] C. Villani. Optimal Transport: Old and New, 2008.

[21] G. Lugosi, et al. High-dimensional random geometric graphs and their clique number, 2011.

[22] S. Kak. Information, physics, and computation, 1996.

[23] Martin E. Dyer, et al. Mixing in time and space for lattice spin systems: A combinatorial view, 2002, RANDOM.

[24] Jiri Matousek, et al. Lectures on discrete geometry, 2002, Graduate texts in mathematics.

[25] Laurent Massoulié, et al. Non-backtracking Spectrum of Random Graphs: Community Detection and Non-regular Ramanujan Graphs, 2014, 2015 IEEE 56th Annual Symposium on Foundations of Computer Science.

[26] J. Dolbeault, et al. Sharp Interpolation Inequalities on the Sphere: New Methods and Consequences, 2012, 1210.1853.

[27] M. Talagrand. Spin glasses: a challenge for mathematicians: cavity and mean field models, 2003.

[28] Guy Bresler, et al. Phase Transitions for Detecting Latent Geometry in Random Graphs, 2019, ArXiv.

[29] Uriel Feige, et al. Relations between average case complexity and approximation complexity, 2002, STOC '02.

[30] Noga Alon, et al. The Probabilistic Method, 2015, Fundamentals of Ramsey Theory.

[31] Benjamin Rossman, et al. On the constant-depth complexity of k-clique, 2008, STOC.

[32] Uriel Feige, et al. Heuristics for Semirandom Graph Problems, 2001, J. Comput. Syst. Sci.

[33] Dror Weitz, et al. Counting independent sets up to the tree threshold, 2006, STOC '06.

[34] O. Papaspiliopoulos. High-Dimensional Probability: An Introduction with Applications in Data Science, 2020.

[35] Dmitriy Katz, et al. Strong spatial mixing of list coloring of graphs, 2012, Random Struct. Algorithms.

[36] Michael Langberg, et al. Graphs with tiny vector chromatic numbers and huge chromatic numbers, 2002, The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings.

[37] Cavity method in the spherical SK model, 2006.

[38] Miklós Z. Rácz, et al. A Smooth Transition from Wishart to GOE, 2016, 1611.05838.

[39] Allan Sly, et al. The Replica Symmetric Solution for Potts Models on d-Regular Graphs, 2022, Communications in Mathematical Physics.

[40] M. Talagrand. Transportation cost for Gaussian and other product measures, 1996.

[41] Miklós Z. Rácz, et al. Phase transition in noisy high-dimensional random geometric graphs, 2021, Electronic Journal of Statistics.

[42] Ronen Eldan, et al. Information and dimensionality of anisotropic random geometric graphs, 2016, ArXiv.

[43] Mark Jerrum, et al. Large Cliques Elude the Metropolis Process, 1992, Random Struct. Algorithms.

[44] Sébastien Bubeck, et al. Entropic CLT and phase transition in high-dimensional Wishart matrices, 2015, ArXiv.