Kolmogorov Complexity: Sources, Theory and Applications

1. UNIVERSALITY

The theory of Kolmogorov complexity is based on the discovery, by Alan Turing in 1936, of the universal Turing machine. After proposing the Turing machine as an explanation of the notion of a computing machine, Turing found that there exists one Turing machine which can simulate any other Turing machine.

Complexity, according to Kolmogorov, can be measured by the length of the shortest program for a universal Turing machine that correctly reproduces the observed data. It has been shown that, although there are many universal Turing machines (and therefore many possible 'shortest' programs), the corresponding complexities differ by at most an additive constant.

The main thrust of the theory of Kolmogorov complexity is its 'universality': it strives to construct universal learning methods based on universal coding methods. This approach was originated by Solomonoff and made more appealing to mathematicians by Kolmogorov. Typically these universal methods will be computable only in some weak sense. In applications, therefore, we can only hope to approximate Kolmogorov complexity and related notions (such as randomness deficiency and algorithmic information, mentioned below). This special issue contains both material on non-computable aspects of Kolmogorov complexity and material on many fascinating applications based on different ways of approximating Kolmogorov complexity.

2. BEGINNINGS

As we have already mentioned, the two main originators of the theory of Kolmogorov complexity were Ray Solomonoff (born 1926) and Andrei Nikolaevich Kolmogorov (1903-1987). The motivations behind their work were completely different: Solomonoff was interested in inductive inference and artificial intelligence, while Kolmogorov was interested in the foundations of probability theory and, also, of information theory. They arrived, nevertheless, at the same mathematical notion, which is now known as Kolmogorov complexity.

In 1964 Solomonoff published his model of inductive inference.
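The remark above, that in applications one can only hope to approximate Kolmogorov complexity, is commonly put into practice by substituting the length of a compressed encoding, which gives a computable upper bound on description length. A minimal sketch in Python (the choice of `zlib` here is purely an illustration, not a method described in this article):

```python
import os
import zlib


def compressed_length(data: bytes) -> int:
    """Length in bytes of a zlib-compressed encoding of `data`: a
    computable upper bound (up to codec overhead) on the length of a
    shortest description of `data`."""
    return len(zlib.compress(data, 9))


# A highly regular string admits a short description and compresses well;
# typical "random" bytes admit (almost surely) no short description.
regular = b"01" * 500          # 1000 bytes with an obvious short description
random_ish = os.urandom(1000)  # 1000 bytes, essentially incompressible

print(compressed_length(regular))     # small
print(compressed_length(random_ish))  # close to 1000, plus codec overhead
```

The gap between the two printed values illustrates the idea behind compression-based approximations: the compressor stands in for the (non-computable) shortest program.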
He argued that any inference problem can be presented as a problem of extrapolating a very long sequence of binary symbols; 'given a very long sequence, represented by T, what is the probability that it will be followed by a
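Both the invariance property of Section 1 and Solomonoff's extrapolation question can be stated precisely. A sketch in standard textbook notation (the symbols U, V, c and M are not taken from the text above):

```latex
% Invariance: for universal machines $U$, $V$ there is a constant
% $c_{U,V}$, independent of $x$, such that
\[
  K_U(x) \le K_V(x) + c_{U,V} \qquad \text{for all strings } x .
\]
% Solomonoff's universal prior: the sum is over programs $p$ that make
% $U$ output a sequence beginning with $x$; extrapolation is then the
% conditional probability of the next bit $b$
\[
  M(x) = \sum_{p \,:\, U(p) = x\ast} 2^{-|p|},
  \qquad
  M(b \mid x) = \frac{M(xb)}{M(x)} .
\]
```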
