Compressive Network Analysis

Modern data acquisition routinely produces massive amounts of network data. Although many methods and models have been proposed to analyze such data, research on network data remains largely disconnected from the classical theory of statistical learning and signal processing. In this paper, we present a new framework for modeling network data, which connects two seemingly different areas: network data analysis and compressed sensing. From a nonparametric perspective, we model an observed network using a large dictionary. In particular, we consider the network clique detection problem and show connections between our formulation and a new algebraic tool, namely Radon basis pursuit in homogeneous spaces. This connection allows us to identify rigorous recovery conditions for clique detection problems. Though this paper is mainly conceptual, we also develop practical approximation algorithms for solving empirical problems and demonstrate their usefulness on real-world datasets.
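To make the dictionary view concrete, the following is a minimal sketch of clique detection posed as basis pursuit. It is not the algorithm developed in the paper. It assumes a fixed clique size k, a dictionary whose columns are edge-indicator vectors of all k-node subsets, and noiseless edge weights; the function names `clique_dictionary` and `basis_pursuit` are hypothetical, and the linear program is solved with SciPy's `linprog`.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog


def clique_dictionary(n, k):
    """Build an edge-by-clique 0/1 matrix (illustrative assumption).

    Rows are the node pairs of an n-node graph, columns are all k-node
    subsets; an entry is 1 when the pair lies inside the subset.
    """
    edges = list(combinations(range(n), 2))
    cliques = list(combinations(range(n), k))
    edge_index = {e: i for i, e in enumerate(edges)}
    A = np.zeros((len(edges), len(cliques)))
    for col, clique in enumerate(cliques):
        for e in combinations(clique, 2):
            A[edge_index[e], col] = 1.0
    return A, edges, cliques


def basis_pursuit(A, b):
    """Solve min ||x||_1 subject to A x = b as a linear program.

    Uses the standard split x = u - v with u, v >= 0, so the objective
    becomes sum(u) + sum(v).
    """
    m, p = A.shape
    res = linprog(
        c=np.ones(2 * p),
        A_eq=np.hstack([A, -A]),
        b_eq=b,
        bounds=(0, None),
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p] - res.x[p:]


# Toy network: 6 nodes containing two disjoint triangles.
A, edges, cliques = clique_dictionary(n=6, k=3)
x_true = np.zeros(len(cliques))
x_true[cliques.index((0, 1, 2))] = 1.0
x_true[cliques.index((3, 4, 5))] = 1.0
b = A @ x_true  # observed edge weights

x_hat = basis_pursuit(A, b)
recovered = [cliques[i] for i in np.flatnonzero(np.abs(x_hat) > 1e-6)]
print(recovered)
```

The dictionary here has 20 columns but only 15 rows, so the linear system alone does not determine the clique weights; the l1 objective is what selects the sparse solution. Enumerating all k-subsets is only feasible for small graphs, which is one reason approximation algorithms are needed in practice.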
