The Redundancy of a Computable Code on a Noncomputable Distribution

We introduce new definitions of universal and superuniversal computable codes, based on a code's ability to approximate the Kolmogorov complexity within a prescribed margin for all individual sequences from a given set. Such sets of sequences may be singled out almost surely with respect to certain probability measures. Consider a measure parameterized by a real parameter and place an arbitrary prior on that parameter. The Bayesian measure is the expectation of the parameterized measure with respect to the prior. It turns out that a modified Shannon-Fano code for any computable Bayesian measure, which we call the Bayesian code, is superuniversal on a set of sequences of parameterized-measure one, for prior-almost every value of the parameter. By this result, in the typical setting of mathematical statistics no computable code achieves redundancy that is ultimately much smaller than that of the Bayesian code. We therefore introduce another characteristic of computable codes: the catch-up time, the length of data beyond which the code length drops below the Kolmogorov complexity plus the prescribed margin. Some codes may have smaller catch-up times than Bayesian codes.
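For concreteness, the following is a minimal LaTeX sketch of the central objects in notation of our own choosing (the parameterized measure $P_\theta$, the prior $\pi$, the prefix Kolmogorov complexity $K$, and a margin function $m$ are illustrative symbols, not necessarily those of the paper; likewise, the precise modification of the Shannon-Fano code is not reproduced here).

% Illustrative notation: P_theta is a probability measure on sequences
% with a real parameter theta, pi is a prior on theta, K denotes prefix
% Kolmogorov complexity, and m(n) is a prescribed margin.

% The Bayesian measure is the prior expectation of the parameterized measure:
\[
  P(w) \;=\; \int P_\theta(w) \, d\pi(\theta).
\]

% The Bayesian code is a modified Shannon--Fano code for P; before the
% modification, its length on a finite string w is
\[
  \lvert B(w) \rvert \;=\; \lceil -\log_2 P(w) \rceil .
\]

% Superuniversality of a code C on a set S with margin m: for every x in S,
\[
  \lvert C(x_{1:n}) \rvert \;\le\; K(x_{1:n}) + m(n)
  \quad \text{for all sufficiently large } n,
\]
% and the catch-up time of C on x is the least data length past which
% this bound holds:
\[
  T_C(x) \;=\; \min \bigl\{\, N \;:\; \lvert C(x_{1:n}) \rvert \le K(x_{1:n}) + m(n)
  \ \text{for all } n \ge N \,\bigr\}.
\]

Under this reading, the main result says that for prior-almost every $\theta$, the Bayesian code has finite catch-up time on $P_\theta$-almost every sequence, while some other computable codes may reach the margin sooner.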
