Markov types and minimax redundancy for Markov sources

The redundancy of a universal code for a class of sources measures by how much the actual code length exceeds the optimal code length. In the minimax scenario, one designs the best code for the worst source within the class. Minimax redundancy comes in two flavors: average minimax and worst case (maximal) minimax. We study the worst case minimax redundancy of universal block codes for Markovian sources of any order. We prove that the maximal minimax redundancy for Markov sources of order $r$ is asymptotically equal to $\frac{1}{2} m^r (m-1) \log_2 n + \log_2 A_m^r - \frac{\ln \ln m^{1/(m-1)}}{\ln m} + o(1)$, where $n$ is the length of a source sequence, $m$ is the size of the alphabet, and $A_m^r$ is an explicit constant (e.g., for a binary alphabet, $m=2$, and a Markov source of order $r=1$, we find $A_2^1 = 16 \cdot G \approx 14.655449504$, where $G$ is Catalan's constant). Unlike previous attempts, we view the redundancy problem as an asymptotic evaluation of certain sums over a set of matrices representing Markov types. The enumeration of Markov types is accomplished by reducing it to counting Eulerian paths in a multigraph. In particular, we give exact and asymptotic formulas for the number of strings of a given Markov type. All of these findings are obtained with analytic and combinatorial tools from the analysis of algorithms.
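To make the objects in the abstract concrete, here is a minimal brute-force sketch in Python for first-order sources ($r=1$). It groups strings by their Markov type (the matrix of transition counts), sums the maximized likelihood over types, and evaluates the leading terms of the asymptotic expression. The function names are hypothetical, and the conventions are assumptions that may differ from the paper's: strings are treated as non-cyclic, and the likelihood is conditioned on the first symbol. The $o(1)$ term is dropped.

```python
import math
from collections import Counter
from itertools import product


def markov_type(x, m):
    """First-order Markov type of x: the m*m transition counts, flattened row by row."""
    counts = [0] * (m * m)
    for a, b in zip(x, x[1:]):
        counts[a * m + b] += 1
    return tuple(counts)


def type_classes(n, m):
    """Map each Markov type to the number of length-n strings that have it (brute force)."""
    return Counter(markov_type(x, m) for x in product(range(m), repeat=n))


def max_likelihood(ktype, m):
    """Largest probability any first-order chain assigns to a string of this type,
    conditioned on the first symbol (an assumed convention)."""
    p = 1.0
    for i in range(m):
        row = ktype[i * m:(i + 1) * m]
        total = sum(row)
        for k in row:
            if k:
                p *= (k / total) ** k
    return p


def worst_case_redundancy(n, m):
    """log2 of the sum over types of (class size) * (maximized likelihood)."""
    total = sum(size * max_likelihood(t, m) for t, size in type_classes(n, m).items())
    return math.log2(total)


def catalan_constant(terms=200_000):
    """Catalan's constant G via its alternating series; error below 1/(2*terms+1)**2."""
    return sum((-1) ** k / (2 * k + 1) ** 2 for k in range(terms))


def asymptotic_redundancy(n, m, r, A):
    """Leading terms of the abstract's formula, with A standing in for A_m^r."""
    return (0.5 * m ** r * (m - 1) * math.log2(n)
            + math.log2(A)
            - math.log(math.log(m ** (1 / (m - 1)))) / math.log(m))


if __name__ == "__main__":
    A21 = 16 * catalan_constant()
    for n in (4, 8, 12):
        print(n, len(type_classes(n, 2)),
              worst_case_redundancy(n, 2), asymptotic_redundancy(n, 2, 1, A21))
```

The enumeration visits all $m^n$ strings, so it is only practical for small $n$. Because of the assumed conventions, the brute-force values should not be expected to match the asymptotic expression exactly.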
