Compressibility of individual sequences by the class of generalized finite-state information-lossless encoders is investigated. These encoders can operate in a variable-rate mode as well as a fixed-rate one, and they allow for any finite-state scheme of variable-length-to-variable-length coding. For every individual infinite sequence x, a quantity \rho(x) is defined, called the compressibility of x, which is shown to be the asymptotically attainable lower bound on the compression ratio that can be achieved for x by any finite-state encoder. This is demonstrated by means of a constructive coding theorem and its converse, which, apart from their asymptotic significance, also provide useful performance criteria for finite and practical data-compression tasks. The proposed concept of compressibility is also shown to play a role analogous to that of entropy in classical information theory, where one deals with probabilistic ensembles of sequences rather than with individual sequences. While the definition of \rho(x) allows a different machine for each different sequence to be compressed, the constructive coding theorem leads to a universal algorithm that is asymptotically optimal for all sequences.
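The abstract does not spell out the universal algorithm, so the following is only an illustrative sketch under stated assumptions. It uses incremental parsing, the dictionary-building scheme commonly associated with this line of work: the sequence is cut into phrases, each being the shortest prefix of the remaining input that has not yet appeared as a phrase. The function names `incremental_parse` and `estimated_ratio`, and the simple bit-cost model (a pointer to an earlier phrase plus one new symbol), are assumptions made for illustration and are not taken from the text.

```python
from math import ceil, log2


def incremental_parse(seq):
    """Split seq into phrases; each phrase is the shortest prefix of the
    remaining input that has not already occurred as a phrase."""
    seen = set()
    phrases = []
    current = ""
    for symbol in seq:
        current += symbol
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = ""
    if current:
        # The input ended in the middle of a phrase that was already seen.
        phrases.append(current)
    return phrases


def estimated_ratio(seq, alphabet_size=2):
    """Rough compression ratio under an assumed cost model: phrase j is
    coded as a pointer to one of j candidates (the empty phrase or an
    earlier phrase) plus one new symbol."""
    c = len(incremental_parse(seq))
    symbol_bits = ceil(log2(alphabet_size))
    coded_bits = sum(ceil(log2(j)) + symbol_bits for j in range(1, c + 1))
    source_bits = len(seq) * log2(alphabet_size)
    return coded_bits / source_bits


if __name__ == "__main__":
    print(incremental_parse("1011010100010"))
    print(estimated_ratio("0" * 10000))
```

For the input `1011010100010` the parse is `1, 0, 11, 01, 010, 00, 10`. On a highly repetitive input the number of phrases grows slowly with the length, so the estimated ratio falls well below 1, which is the behaviour a small \rho(x) is meant to capture.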