A maximum entropy framework for nonexponential distributions

Significance. Many statistical distributions, particularly in social and biological systems, have “heavy tails”: rare events are not as improbable as more traditional statistics would suggest. Heavy-tailed distributions are the basis for the phrase “the rich get richer.” Here, we propose a basic principle underlying systems with heavy-tailed distributions. We show that it is the same principle (maximum entropy) used in statistical physics and statistics to estimate probabilistic models from relatively few constraints. The heavy-tail principle can be expressed in terms of shared costs and economies of scale. The probability distribution we derive is expressed through the digamma function, and we show that it accurately fits 13 real-world data sets.

Probability distributions having power-law tails are observed in a broad range of social, economic, and biological systems. We describe here a potentially useful common framework. We derive distribution functions for situations in which a “joiner particle” k pays some form of price to enter a community of size k − 1, where costs are subject to economies of scale. Maximizing the Boltzmann–Gibbs–Shannon entropy subject to this energy-like constraint predicts a distribution having a power-law tail; it reduces to the Boltzmann distribution in the absence of economies of scale. We show that the predicted function gives excellent fits to 13 different distribution functions, ranging from friendship links in social networks, to protein–protein interactions, to the severity of terrorist attacks. This approach may give useful insights into when to expect power-law distributions in the natural and social sciences.
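The argument above can be illustrated numerically. Maximizing the Boltzmann–Gibbs–Shannon entropy subject to a fixed mean cumulative cost E_k gives weights proportional to exp(-E_k), with the Lagrange multiplier absorbed into the cost parameters. The abstract does not state the cost function, so the sketch below assumes a hypothetical per-joiner cost mu + k0 / (j + k0) that falls as the community grows; the names `mu`, `k0`, `maxent_distribution`, and `local_slope` are illustrative choices, not the authors' notation. This form was picked only because its running sum is a difference of digamma functions, which is consistent with the digamma form mentioned above.

```python
import math


def maxent_distribution(mu, k0, k_max):
    """Return [p_1, ..., p_k_max] with p_k proportional to exp(-E_k).

    E_k is the cumulative cost of assembling a community of size k,
    using an ASSUMED per-joiner cost mu + k0 / (j + k0) (illustrative only).
    """
    log_weights = []
    energy = 0.0
    for j in range(1, k_max + 1):
        energy += mu + k0 / (j + k0)  # cost shrinks toward mu as j grows
        log_weights.append(-energy)
    z = sum(math.exp(v) for v in log_weights)  # normalization over 1..k_max
    return [math.exp(v) / z for v in log_weights]


def local_slope(p, k):
    """Log-log slope of p between sizes k and 2k (p[0] holds p_1)."""
    return (math.log(p[2 * k - 1]) - math.log(p[k - 1])) / math.log(2.0)


# Economies of scale only (mu = 0): the tail slope should be close to -k0.
heavy = maxent_distribution(mu=0.0, k0=2.5, k_max=2000)
print("log-log slope near k = 500:", local_slope(heavy, 500))

# No economies of scale (k0 = 0): constant cost per joiner, geometric decay.
boltzmann = maxent_distribution(mu=0.1, k0=0.0, k_max=2000)
print("log ratio of successive terms:", math.log(boltzmann[10] / boltzmann[9]))
```

Under these assumptions, setting `mu` to zero leaves a tail that decays roughly as a power law with exponent `k0`, while setting `k0` to zero leaves a constant cost per joiner and the distribution becomes a discrete exponential (Boltzmann-like) form. The sum is truncated at `k_max`, so the normalization is approximate for very heavy tails.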
