Adaptive Coding and Prediction of Sources With Large and Infinite Alphabets

We consider the problem of predicting a sequence $x_1, x_2, \ldots$ generated by a discrete source with unknown statistics. Each letter $x_{t+1}$ is predicted using only the information in the word $x_1 x_2 \ldots x_t$. This problem is of great importance for data compression, because such predictors are used to estimate probability distributions in PPM algorithms and other adaptive codes. It is also a classical problem that has received much attention, and its history can be traced back to Laplace. We address the case where the sequence is generated by an independent and identically distributed (i.i.d.) source with a large (or even infinite) alphabet, and we suggest a class of new prediction methods.
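To make the setting concrete, the following is a minimal sketch of the classical Laplace (add-one) predictor mentioned above, not of the new methods proposed in the paper. It assumes a finite, known alphabet; the function names `laplace_predict` and `adaptive_code_length` are illustrative and do not come from the source.

```python
from collections import Counter
from fractions import Fraction
import math


def laplace_predict(prefix, alphabet):
    """Laplace estimate P(a | x_1..x_t) = (count(a) + 1) / (t + |A|).

    Assumes `alphabet` is a finite, known collection of distinct letters.
    """
    counts = Counter(prefix)
    t = len(prefix)
    size = len(alphabet)
    return {a: Fraction(counts[a] + 1, t + size) for a in alphabet}


def adaptive_code_length(sequence, alphabet):
    """Ideal adaptive code length in bits.

    Sums -log2 P(x_{t+1} | x_1..x_t) over the sequence, where each
    prediction uses only the letters seen so far.
    """
    total = 0.0
    for t, letter in enumerate(sequence):
        p = laplace_predict(sequence[:t], alphabet)[letter]
        total += -math.log2(p)
    return total


if __name__ == "__main__":
    probs = laplace_predict("aab", "ab")
    print(probs["a"], probs["b"])
    print(adaptive_code_length("aab", "ab"))
```

In this estimator the denominator grows with the alphabet size, so with a very large alphabet every letter starts with a very small probability, and the rule is undefined when the alphabet is infinite. That is the regime the abstract targets.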
