Computable Bayesian Compression for Uniformly Discretizable Statistical Models

Supplementing the 'if' part of Vovk and V'yugin's result, we show that Bayesian compression provides the best enumerable compression of parameter-typical data if and only if the parameter is Martin-Löf random with respect to the prior. The result is derived for uniformly discretizable statistical models, a class introduced here. Their crucial property is that, given a discretized parameter, one can compute how much data is needed to learn its value with little uncertainty. We show that exponential families and certain nonparametric models are uniformly discretizable.
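For orientation, the Bayesian code referred to above can be sketched as follows; the notation is a standard rendering of the setup, not necessarily the paper's exact symbols. Given a parametric family $(P_\theta)_{\theta \in \Theta}$ and a prior $\pi$ on $\Theta$, the Bayesian mixture measure and its code length are

```latex
Q(x^{1:n}) \;=\; \int_{\Theta} P_{\theta}(x^{1:n})\,\mathrm{d}\pi(\theta),
\qquad
L_{Q}(x^{1:n}) \;=\; -\log Q(x^{1:n}).
```

The abstract's claim is that, on $P_\theta$-typical data, $L_Q$ matches the shortest enumerable code length up to an additive constant if and only if $\theta$ is Martin-Löf random with respect to $\pi$.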

[1]  Hayato Takahashi et al., On a definition of random sequences with respect to conditional probability, 2006, Inf. Comput.

[2]  Ming Li et al., Minimum description length induction, Bayesianism, and Kolmogorov complexity, 1999, IEEE Trans. Inf. Theory.

[3]  H. Jeffreys, Theory of Probability, 1939.

[4]  P. Grünwald, The Minimum Description Length Principle (Adaptive Computation and Machine Learning), 2007.

[5]  A. P. Dawid, Present position and potential developments: some personal views, 1984.

[6]  Thomas M. Cover and Joy A. Thomas, Elements of Information Theory, 2005.

[7]  I. Csiszár et al., The consistency of the BIC Markov order estimator, 2000, IEEE International Symposium on Information Theory.

[8]  T. Speed et al., Data compression and histograms, 1992.

[10]  O. Barndorff-Nielsen, Information and Exponential Families, 1970.

[11]  Lukasz Debowski, On the Vocabulary of Grammar-Based Codes and the Logical Consistency of Texts, 2008, IEEE Trans. Inf. Theory.

[12]  Lei Li et al., Iterated logarithmic expansions of the pathwise code lengths for exponential families, 2000, IEEE Trans. Inf. Theory.

[13]  Jorma Rissanen et al., The Minimum Description Length Principle in Coding and Modeling, 1998, IEEE Trans. Inf. Theory.

[14]  Vladimir Vovk et al., Prequential Level of Impossibility with Some Applications, 1994.

[16]  John E. Hopcroft and Jeffrey D. Ullman, Introduction to Automata Theory, Languages and Computation, 1979.

[18]  Peter Elias, Universal codeword sets and representations of the integers, 1975, IEEE Trans. Inf. Theory.

[20]  W. Fitch, Random sequences, 1983, Journal of Molecular Biology.

[21]  V. Vovk et al., On the Empirical Validity of the Bayesian Method, 1993.

[22]  Ming Li and Paul M. B. Vitányi, An Introduction to Kolmogorov Complexity and Its Applications, 1993, Graduate Texts in Computer Science.