A Decomposition Method for Global Evaluation of Shannon Entropy and Local Estimations of Algorithmic Complexity

We investigate the properties of a Block Decomposition Method (BDM), which extends the power of the Coding Theorem Method (CTM). CTM approximates local estimations of algorithmic complexity based on Solomonoff–Levin's theory of algorithmic probability, providing a closer connection to algorithmic complexity than previous approaches that rely on statistical regularities, such as popular lossless compression schemes. The strategy behind BDM is to find small computer programs that produce the components of a larger, decomposed object. These short programs can then be arranged in sequence so as to reproduce the original object. We show that the method provides efficient estimations of algorithmic complexity, and that it degrades gracefully, performing like Shannon entropy when it loses accuracy. We estimate errors and study the behaviour of BDM under different boundary conditions, all of which are compared and assessed in detail. The measure can be adapted for use with multi-dimensional objects other than strings, such as arrays and tensors. To test the measure, we demonstrate the power of CTM on objects of low algorithmic randomness that are nevertheless assigned maximal entropy (e.g., the digits of π), showing that their numerical estimates are closer to the theoretically expected low algorithmic randomness. We also test the measure on larger objects, including dual, isomorphic and cospectral graphs, for which algorithmic randomness is known to be low. We also release implementations of the methods in most major programming languages (Wolfram Language/Mathematica, Matlab, R, Perl, Python, Pascal, C++, and Haskell), along with an online algorithmic complexity calculator.
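As a concrete illustration of the decomposition strategy described above, the following minimal Python sketch computes a BDM estimate under the paper's additive rule: each distinct block contributes its CTM value once, and repetitions of a block are charged only log2 of their multiplicity (the term that makes BDM behave like Shannon entropy when it loses accuracy). The lookup table, its numerical values, and the function names are illustrative assumptions for this sketch, not the released implementation, whose tables are precomputed from the output frequency distributions of small Turing machines.

    from collections import Counter
    from math import log2

    # Toy CTM lookup table, for illustration only: maps length-4 binary
    # blocks to hypothetical algorithmic-complexity estimates (in bits).
    CTM_TABLE = {
        "0000": 10.2, "1111": 10.2,
        "0101": 12.5, "1010": 12.5,
        "0011": 12.0,
    }

    def bdm(s, block_size=4, ctm=CTM_TABLE):
        """Minimal BDM estimate: decompose `s` into blocks, charge each
        distinct block its CTM value once, plus log2(multiplicity) for
        its repetitions (the Shannon-entropy-like term)."""
        blocks = [s[i:i + block_size] for i in range(0, len(s), block_size)]
        # Boundary condition: a shorter trailing block is kept as-is here;
        # the paper studies several alternatives for handling leftovers.
        counts = Counter(blocks)
        return sum(ctm[block] + log2(n) for block, n in counts.items())

    print(bdm("0000" * 8))           # one repeated block: 10.2 + log2(8) = 13.2
    print(bdm("0101" + "0011" * 3))  # two distinct blocks: 12.5 + 12.0 + log2(3)

Note how the repetitive string scores far lower than its Shannon block entropy alone would suggest: repeating a block n times costs only log2(n) extra bits rather than n copies of its complexity, which is what lets BDM separate structured repetition from genuine algorithmic randomness.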
