On Context-Tree Prediction of Individual Sequences

Motivated by the evident success of context-tree based methods in lossless data compression, we explore, in this paper, methods of the same spirit in the universal prediction of individual sequences. By context-tree prediction, we refer to a family of prediction schemes where, at each time instant t, after having observed all outcomes of the data sequence x_1, ..., x_{t-1}, but not yet x_t, the prediction is based on a "context" (or a state) consisting of the k most recent past outcomes x_{t-k}, ..., x_{t-1}, where the choice of k may depend on the contents of a possibly longer, though limited, portion of the observed past, x_{t-k_max}, ..., x_{t-1}. This differs from the study reported in Feder et al. (1992), where general finite-state predictors, as well as "Markov" (finite-memory) predictors of fixed order, were studied in the regime of individual sequences. Another important difference between this study and that of Feder et al. is the asymptotic regime: while in Feder et al. the resources of the predictor (i.e., the number of states or the memory size) were kept fixed regardless of the length N of the data sequence, here we investigate situations where the number of contexts, or states, is allowed to grow concurrently with N. We are primarily interested in the following fundamental question: what is the critical growth rate of the number of contexts, below which the performance of the best context-tree predictor is still universally achievable, but above which it is not? We show that this critical growth rate is linear in N. In particular, we propose a universal context-tree algorithm that essentially achieves optimum performance as long as the growth rate is sublinear, and show that, on the other hand, this is impossible when the growth rate is linear.
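To make the scheme concrete, the following is a minimal sketch in Python of one possible context-tree predictor for binary sequences. It keeps symbol counts under every suffix (context) of the observed past up to a bounded depth, and at each time instant selects a context length k by descending to the deepest context with enough evidence, then predicts the majority symbol seen after that context. The class name ContextTreePredictor, the parameters k_max and min_count, the depth-selection rule, and the majority-vote predictor are illustrative assumptions, not the algorithm analyzed in the paper.

```python
# A minimal sketch of a context-tree predictor for binary sequences.
# At each time t it walks down the suffixes (contexts) of the observed
# past, up to a maximum depth k_max, and predicts the symbol most
# frequently seen after the deepest context with enough evidence.
# The depth rule and the majority-vote predictor are illustrative
# assumptions, not the algorithm proposed in the paper.

from collections import defaultdict

class ContextTreePredictor:
    def __init__(self, k_max=8, min_count=2):
        self.k_max = k_max            # bound on the context (memory) length
        self.min_count = min_count    # evidence needed to trust a deeper context
        # counts[context][symbol] = times `symbol` followed `context`
        self.counts = defaultdict(lambda: defaultdict(int))
        self.past = []                # observed outcomes x_1, ..., x_{t-1}

    def predict(self):
        """Predict x_t from the longest sufficiently observed suffix."""
        best = self.counts[()]        # the empty context is always available
        for k in range(1, min(self.k_max, len(self.past)) + 1):
            ctx = tuple(self.past[-k:])          # x_{t-k}, ..., x_{t-1}
            node = self.counts[ctx]
            if sum(node.values()) < self.min_count:
                break                 # not enough evidence; stop deepening
            best = node
        if not best:
            return 0                  # arbitrary default before any data
        return max(best, key=best.get)

    def update(self, x):
        """Record the outcome x_t under every suffix context up to k_max."""
        for k in range(0, min(self.k_max, len(self.past)) + 1):
            ctx = tuple(self.past[-k:]) if k > 0 else ()
            self.counts[ctx][x] += 1
        self.past.append(x)

# Usage: predict each bit of a sequence online, then reveal it.
seq = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
p = ContextTreePredictor(k_max=4)
errors = 0
for x in seq:
    errors += (p.predict() != x)
    p.update(x)
print(f"prediction errors: {errors} out of {len(seq)}")
```

The depth rule mirrors the description in the abstract: the context length k is not fixed in advance but is chosen, at each time instant, from the contents of a bounded window of the observed past.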

[1] Neri Merhav, et al. On Context-Tree Prediction of Individual Sequences, 2005, 2006 IEEE Information Theory Workshop - ITW '06, Punta del Este.

[2] Guillermo Sapiro, et al. LOCO-I: a low complexity, context-based, lossless image compression algorithm, 1996, Proceedings of Data Compression Conference - DCC '96.

[3] Jacob Ziv, et al. Correction to: "An Efficient Universal Prediction Algorithm for Unknown Sources With Limited Training Data", 2004, IEEE Trans. Inf. Theory.

[4] Gábor Lugosi, et al. Prediction, Learning, and Games, 2006.

[5] Gadiel Seroussi, et al. Linear time universal coding and time reversal of tree sources via FSM closure, 2004, IEEE Trans. Inf. Theory.

[6] Neri Merhav, et al. On the Wyner-Ziv problem for individual sequences, 2006, IEEE Trans. Inf. Theory.

[7] G. Lugosi, et al. On Prediction of Individual Sequences, 1998.

[8] Frans M. J. Willems, et al. The Context-Tree Weighting Method: Extensions, 1998, IEEE Trans. Inf. Theory.

[9] Neri Merhav, et al. Universal prediction of individual sequences, 1992, IEEE Trans. Inf. Theory.

[10] Gadiel Seroussi, et al. Sequential prediction and ranking in universal context modeling and data compression, 1997, IEEE Trans. Inf. Theory.

[11] Jacob Ziv, et al. An efficient universal prediction algorithm for unknown sources with limited training data, 2002, IEEE Trans. Inf. Theory.

[12] Abraham Lempel, et al. Compression of individual sequences via variable-rate coding, 1978, IEEE Trans. Inf. Theory.

[13] Frans M. J. Willems, et al. The context-tree weighting method: basic properties, 1995, IEEE Trans. Inf. Theory.

[14] Neri Merhav, et al. Universal Prediction, 1998, IEEE Trans. Inf. Theory.

[15] Philippe Jacquet, et al. A universal predictor based on pattern matching, 2002, IEEE Trans. Inf. Theory.