The unrestricted LZ78 universal data-compression algorithm (as well as the LZ77 and LZW versions) asymptotically achieves, as the block length tends to infinity, the finite-state (FS) compressibility of any infinitely long individual sequence, namely the best compression ratio that may be achieved by any information-lossless (IL) block-to-variable FS algorithm. The encoder parses the sequence into distinct phrases, where each newly generated phrase is a past phrase that is already stored in a dictionary, extended by one letter. The newly generated phrase is then added to the updated, ever-growing dictionary. In practice the dictionary cannot grow without bound, so entries must eventually be deleted. One heuristic is the "Least Recently Utilized" (LRU) deletion approach, where only the D most recently utilized entries are kept in the dictionary, yielding a constrained-dictionary version of LZ78 denoted by LZ78(LRU). In this note, for the sake of completeness, a simple proof is given again that the unrestricted LZ78 algorithm asymptotically achieves the FS compressibility. It is then demonstrated that the information-lossless LZ78(LRU) algorithm also achieves the FS compressibility as the dictionary size D tends to infinity. Although this is perhaps not surprising, it nevertheless yields a theoretical optimality argument for the popular LZ78(LRU) algorithm (and, similarly, for the LZW(LRU) algorithm). In addition, the finite-state compressibility of an individual sequence is defined under a constraint on the allowable distance between the original sequence and the decompressed sequence. It is demonstrated that a particular adaptive vector quantizer that sequentially maps clusters of L-vectors onto a single cluster-representative L-vector, followed by a constrained D-entry-dictionary version of LZ78(LRU) as above, is asymptotically optimal as D tends to infinity and L = log D.
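The incremental parsing and LRU deletion rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the construction analyzed in the note: the function names, the `max_entries` parameter (standing in for D), the use of ever-increasing phrase identifiers, and the choice to count only the finally matched prefix as "utilized" are all hypothetical. Setting `max_entries=None` gives the unrestricted LZ78 parsing.

```python
from collections import OrderedDict


def lz78_lru_encode(text, max_entries=None):
    """Parse text into (prefix_id, letter) pairs; id 0 is the empty phrase."""
    dictionary = OrderedDict()  # phrase -> id, least recently utilized first
    next_id = 1
    pairs = []
    w = ""  # longest dictionary phrase matched so far
    for c in text:
        if w + c in dictionary:
            w += c  # keep extending the current match
            continue
        # New phrase: a stored phrase w extended by the single letter c.
        pairs.append((dictionary[w] if w else 0, c))
        if w:
            dictionary.move_to_end(w)  # mark the matched prefix as utilized
        dictionary[w + c] = next_id
        next_id += 1
        if max_entries is not None and len(dictionary) > max_entries:
            dictionary.popitem(last=False)  # delete the LRU entry
        w = ""
    if w:
        pairs.append((dictionary[w], ""))  # flush an unfinished final match
    return pairs


def lz78_lru_decode(pairs, max_entries=None):
    """Invert lz78_lru_encode by mirroring its dictionary updates."""
    dictionary = OrderedDict()  # id -> phrase, least recently utilized first
    next_id = 1
    out = []
    for prefix_id, c in pairs:
        prefix = dictionary[prefix_id] if prefix_id else ""
        out.append(prefix + c)
        if not c:
            break  # final flush pair carries no new phrase
        if prefix_id:
            dictionary.move_to_end(prefix_id)
        dictionary[next_id] = prefix + c
        next_id += 1
        if max_entries is not None and len(dictionary) > max_entries:
            dictionary.popitem(last=False)
    return "".join(out)


if __name__ == "__main__":
    message = "abababababa"
    for d in (None, 4, 2):
        coded = lz78_lru_encode(message, max_entries=d)
        assert lz78_lru_decode(coded, max_entries=d) == message
        print(d, len(coded), coded)
```

Because the decoder applies the same move-to-end and deletion steps in the same order as the encoder, both sides hold identical dictionaries at every step, which is what keeps the bounded-dictionary scheme information-lossless in this sketch. The identifiers here grow without bound for simplicity; a practical coder would presumably reuse D slot numbers so that each pointer costs on the order of log D bits.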