The redundancy and distribution of the phrase lengths of the fixed-database Lempel-Ziv algorithm

The fixed-database version of the Lempel-Ziv algorithm closely resembles many versions that appear in practice. We ascertain several key asymptotic properties of the algorithm as applied to sources with finite memory. First, we determine that for a dictionary of size $n$, the algorithm achieves a redundancy $\rho_n = H \frac{\log\log n}{\log n} + o\left(\frac{\log\log n}{\log n}\right)$, where $H$ is the entropy of the process. This is the first nontrivial lower bound on any Lempel-Ziv-type compression scheme. We then find the limiting distribution and all moments of the lengths of the phrases by comparing them to a random-walk-like variable with well-known behavior.
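To make the objects in the abstract concrete, the following is a minimal sketch of one common way to describe fixed-database parsing: each phrase is the longest prefix of the remaining input that occurs somewhere in a fixed database of n symbols. It is an illustration under stated assumptions, not the paper's exact construction. The names `fdlz_parse` and `leading_redundancy` are hypothetical, the literal-symbol fallback for unmatched symbols is an assumption, and the use of base-2 logarithms (with H in bits) is also an assumption, since the abstract does not fix the base.

```python
import math


def fdlz_parse(database: str, text: str) -> list[str]:
    """Greedy parse of `text` against a fixed `database` (illustrative only).

    Each phrase is the longest prefix of the remaining text that occurs
    as a substring of the database. A symbol that never occurs in the
    database is emitted as a one-symbol literal phrase (an assumption).
    """
    phrases = []
    i = 0
    while i < len(text):
        length = 0
        # Extend the candidate phrase while it still occurs in the database.
        while i + length < len(text) and text[i:i + length + 1] in database:
            length += 1
        length = max(length, 1)  # no match at all: emit a literal symbol
        phrases.append(text[i:i + length])
        i += length
    return phrases


def leading_redundancy(entropy_bits: float, n: int) -> float:
    """Leading term H * log log n / log n, with base-2 logs (an assumption)."""
    return entropy_bits * math.log2(math.log2(n)) / math.log2(n)


if __name__ == "__main__":
    print(fdlz_parse("abracadabra", "abrabracad"))
    print(leading_redundancy(1.0, 2**16))
```

The substring search here is quadratic and is meant only to show how phrase lengths arise; the phrase lengths it produces are the quantities whose distribution the abstract refers to, and `leading_redundancy` evaluates only the leading term of the stated redundancy, ignoring the lower-order remainder.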
