An on-line universal lossy data compression algorithm via continuous codebook refinement - Part I: Basic results

A new on-line universal lossy data compression algorithm is presented. For finite memoryless sources with unknown statistics, its performance asymptotically approaches the fundamental rate distortion limit. The codebook is generated on the fly, and continuously adapted by simple rules. There is no separate codebook training or codebook transmission. Candidate codewords are randomly generated according to an arbitrary and possibly suboptimal distribution. Through a carefully designed "gold washing" or "information-theoretic sieve" mechanism, good codewords and only good codewords are promoted to permanent status with high probability. We also determine the rate at which our algorithm approaches the fundamental limit.
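The promotion mechanism described above can be illustrated with a toy sketch. This is not the authors' algorithm: the binary alphabet, the Hamming distortion measure, the hit-count promotion rule (`promote_after`), the pool size, and every function and parameter name are assumptions made purely for illustration. The sketch also assumes the decoder can regenerate the same candidates from a shared random seed, which is one way to avoid transmitting the codebook.

```python
import random


def hamming(a, b):
    """Per-symbol Hamming distortion between two equal-length tuples."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def gold_wash_encode(blocks, block_len, max_distortion,
                     num_candidates=64, promote_after=2, seed=0):
    """Toy codebook refinement (hypothetical rules, not the paper's)."""
    rng = random.Random(seed)  # assumed to be shared with the decoder

    def fresh_candidate():
        # Candidates are drawn from an arbitrary (here uniform) distribution.
        return [tuple(rng.randint(0, 1) for _ in range(block_len)), 0]

    permanent = []  # codewords that have been promoted
    candidates = [fresh_candidate() for _ in range(num_candidates)]
    output, reconstructions = [], []

    for block in blocks:
        # 1. Try the permanent codebook first.
        hit = next((i for i, c in enumerate(permanent)
                    if hamming(block, c) <= max_distortion), None)
        if hit is not None:
            output.append(("P", hit))
            reconstructions.append(permanent[hit])
            continue

        # 2. Otherwise try the candidate pool and count the hit.
        hit = next((i for i, (c, _) in enumerate(candidates)
                    if hamming(block, c) <= max_distortion), None)
        if hit is not None:
            word, count = candidates[hit]
            output.append(("C", hit))
            reconstructions.append(word)
            if count + 1 >= promote_after:
                # The candidate proved useful often enough: promote it.
                permanent.append(word)
                candidates[hit] = fresh_candidate()
            else:
                candidates[hit][1] = count + 1
            continue

        # 3. No match: send the block uncoded and refresh one candidate.
        output.append(("RAW", block))
        reconstructions.append(block)
        candidates[rng.randrange(num_candidates)] = fresh_candidate()

    return output, reconstructions, permanent


# Usage: a memoryless binary source with P(1) = 0.2, blocks of length 8.
source_rng = random.Random(1)
blocks = [tuple(int(source_rng.random() < 0.2) for _ in range(8))
          for _ in range(500)]
output, recon, permanent = gold_wash_encode(blocks, 8, 0.25)
print(len(permanent), "codewords promoted")
```

In this sketch every reconstruction meets the distortion target by construction, because a block that matches no codeword is sent uncoded. The rate analysis and the actual sieve rules are in the paper itself and are not reproduced here.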
