A Kolmogorov Complexity-based Genetic Programming Tool for String Compression

Following the guidelines set out in one of our previous papers, we address the problem of estimating the Kolmogorov complexity of binary strings with a Genetic Programming approach. The approach evolves a population of Lisp programs in search of the "optimal" program that generates a given string. Using several target binary strings drawn from different formal languages, we show that the approach is effective at approximating the Kolmogorov complexity function from above. Moreover, a suitable choice of "similar" target strings leads our system to exhibit interesting computational strategies. Experimental results indicate that our tool achieves promising compression rates for binary strings belonging to formal languages. The method also works for more complicated strings, provided that some degree of loss is accepted. These results constitute a first step towards using Kolmogorov complexity for string compression.
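The idea of bounding Kolmogorov complexity from above by evolving short generating programs can be illustrated with a minimal sketch. It is not the system described here: it assumes a hypothetical three-operator mini-language (`lit`, `cat`, `rep`) written as nested Python tuples in place of Lisp, a crude node-counting size measure, and mutation-only search with truncation selection instead of a full Genetic Programming operator set. All names (`run`, `size`, `evolve`, `MAX_LEN`) are illustrative.

```python
import random

# Hypothetical mini-language, with nested tuples standing in for S-expressions:
#   ("lit", bits)   -> the literal bit string
#   ("cat", p, q)   -> output of p followed by output of q
#   ("rep", p, n)   -> output of p repeated n times
MAX_LEN = 256  # cap on output length so evaluation stays cheap

def run(prog):
    """Evaluate a program to the bit string it generates."""
    op = prog[0]
    if op == "lit":
        return prog[1]
    if op == "cat":
        return (run(prog[1]) + run(prog[2]))[:MAX_LEN]
    if op == "rep":
        return (run(prog[1]) * prog[2])[:MAX_LEN]
    raise ValueError(op)

def size(prog):
    """Assumed description length: one unit per node, literal bit, and repeat count."""
    op = prog[0]
    if op == "lit":
        return 1 + len(prog[1])
    if op == "cat":
        return 1 + size(prog[1]) + size(prog[2])
    return 2 + size(prog[1])  # "rep": node + count + body

def fitness(prog, target):
    """Lexicographic fitness: reproduce the target exactly first, then be short."""
    out = run(prog)
    mismatches = sum(a != b for a, b in zip(out, target))
    return (mismatches + abs(len(out) - len(target)), size(prog))

def random_prog(rng, depth=2):
    if depth == 0 or rng.random() < 0.3:
        bits = "".join(rng.choice("01") for _ in range(rng.randint(1, 3)))
        return ("lit", bits)
    if rng.random() < 0.5:
        return ("cat", random_prog(rng, depth - 1), random_prog(rng, depth - 1))
    return ("rep", random_prog(rng, depth - 1), rng.randint(2, 8))

def mutate(prog, rng):
    """Replace a randomly chosen subtree (or a repeat count) with a fresh one."""
    if prog[0] == "lit" or rng.random() < 0.3:
        return random_prog(rng)
    if prog[0] == "cat":
        if rng.random() < 0.5:
            return ("cat", mutate(prog[1], rng), prog[2])
        return ("cat", prog[1], mutate(prog[2], rng))
    if rng.random() < 0.5:
        return ("rep", mutate(prog[1], rng), prog[2])
    return ("rep", prog[1], rng.randint(2, 8))

def evolve(target, pop_size=60, generations=80, seed=0):
    """Return the best program found; target must be at most MAX_LEN bits."""
    rng = random.Random(seed)
    # The literal program is the trivial upper bound: size len(target) + 1.
    pop = [("lit", target)] + [random_prog(rng) for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=lambda p: fitness(p, target))
        survivors = pop[: pop_size // 2]  # the best program always survives
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children
    return min(pop, key=lambda p: fitness(p, target))
```

Calling `evolve("01" * 16)` returns a program that reproduces the target exactly, and `size` of that program is an upper bound on the string's complexity under this toy measure. Because the literal program is seeded into the population and the best individual is never discarded, the bound can only tighten from `len(target) + 1`. It never drops below the true minimum, which mirrors the approximation from above described in the abstract.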
