Entropic Spectral Learning in Large Scale Networks

We present a novel algorithm for learning the spectral density of large-scale networks using stochastic trace estimation and the method of maximum entropy. The complexity of the algorithm is linear in the number of non-zero elements of the network's matrix representation, offering a computational advantage over other algorithms. We apply the algorithm to community detection in large networks and show state-of-the-art performance on both synthetic and real datasets.
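
The abstract does not spell out the estimator, so the following is only a minimal sketch of one standard form of stochastic trace estimation (Hutchinson's estimator with random ±1 probe vectors), not the authors' implementation. It estimates the normalized spectral moments tr(A^k)/n using nothing but sparse matrix-vector products, which is where a cost linear in the number of non-zero elements comes from. The function names, the (row, col, value) triple representation, and the assumption that such moments would serve as constraints for a maximum entropy density are all illustrative choices.

```python
import random


def sparse_matvec(entries, x):
    """Multiply a sparse matrix, stored as (row, col, value) triples, by x."""
    y = [0.0] * len(x)
    for i, j, a_ij in entries:  # one multiply-add per non-zero element
        y[i] += a_ij * x[j]
    return y


def estimate_moments(entries, n, num_moments, num_probes=30, seed=0):
    """Hutchinson estimates of tr(A^k) / n for k = 1..num_moments.

    Hypothetical helper: `entries` lists the non-zeros of an n x n
    symmetric matrix A, and each probe z has independent +/-1 entries.
    """
    rng = random.Random(seed)
    moments = [0.0] * num_moments
    for _ in range(num_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        v = z
        for k in range(num_moments):
            v = sparse_matvec(entries, v)  # v now holds A^(k+1) z
            moments[k] += sum(zi * vi for zi, vi in zip(z, v)) / n
    return [m / num_probes for m in moments]


# Adjacency matrix of a 4-node path graph, stored symmetrically.
path_edges = [(0, 1), (1, 2), (2, 3)]
path_entries = [(i, j, 1.0) for i, j in path_edges]
path_entries += [(j, i, 1.0) for i, j in path_edges]
path_moments = estimate_moments(path_entries, n=4, num_moments=4)
```

Each probe costs `num_moments` matrix-vector products, so the total work is proportional to `num_probes * num_moments * len(entries)`. The estimates are random for a general matrix and tighten as `num_probes` grows; for a diagonal matrix they are exact, which makes a convenient sanity check.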
