Thermodynamics of the Minimum Description Length on Community Detection

Modern statistical modeling is an important complement to the more traditional approach of physics, in which complex systems are studied by means of extremely simple idealized models. The Minimum Description Length (MDL) principle is a principled approach to statistical modeling that combines Occam's razor with information theory to select the models that provide the most concise descriptions. In this work, we introduce the Boltzmannian MDL (BMDL), a formalization of the MDL principle in which the parametric complexity is conveniently formulated as the free energy of an artificial thermodynamic system. In this way, we leverage the rich theoretical and technical background of statistical mechanics to show the crucial importance that phase transitions and other thermodynamic concepts have for the problem of statistical modeling from an information-theoretic point of view. For example, we provide information-theoretic justifications of why a high-temperature series expansion can be used to compute systematic approximations of the BMDL when the formalism is used to model data, and why statistically significant model selections can be identified with ordered phases when the BMDL is used to model models. To test the formalism, we compute approximations of the BMDL for the problem of community detection in complex networks, where we obtain a principled MDL derivation of the Girvan-Newman (GN) modularity and of the Zhang-Moore (ZM) community detection method. By means of analytical estimations and numerical experiments on synthetic and empirical networks, we find that BMDL-based correction terms to the GN modularity improve the quality of the detected communities. We also find an information-theoretic justification of why the ZM criterion for estimating the number of network communities performs better than alternative approaches such as the bare minimization of a free energy.
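For readers unfamiliar with the GN modularity mentioned above, the following is a minimal sketch of the standard textbook definition for an undirected, unweighted graph, Q = sum over communities c of [e_c/m - (d_c/(2m))^2], where m is the number of edges, e_c the number of edges inside community c, and d_c the total degree of its nodes. The abstract does not state this formula, so its use here is an assumption, and the BMDL-based correction terms are not reproduced. The function name `modularity` and the toy graph are illustrative only.

```python
from collections import defaultdict


def modularity(edges, community):
    """Standard Newman-Girvan modularity of an undirected, unweighted graph.

    edges: list of (u, v) pairs, one entry per undirected edge.
    community: dict mapping every node to its community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")

    degree = defaultdict(int)    # node -> degree
    internal = defaultdict(int)  # community -> number of internal edges
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1

    total_degree = defaultdict(int)  # community -> sum of member degrees
    for node, k in degree.items():
        total_degree[community[node]] += k

    # Q = sum_c [ e_c / m - (d_c / 2m)^2 ]
    return sum(
        internal[c] / m - (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )


# Toy graph (hypothetical data): two triangles joined by a single edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

q_two = modularity(toy_edges, two_groups)  # 2 * (3/7 - (7/14)^2) = 5/14
q_one = modularity(toy_edges, one_group)   # 7/7 - (14/14)^2 = 0
```

On this toy graph the split into the two triangles scores higher than the trivial single-community partition, which is the behavior that modularity maximization exploits.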
