The empirical distribution of rate-constrained source codes

Let $X = (X_1, X_2, \ldots)$ be a stationary ergodic finite-alphabet source, let $X^n$ denote its first $n$ symbols, and let $Y^n$ be the codeword assigned to $X^n$ by a lossy source code. The empirical $k$th-order joint distribution $\hat{Q}^k[X^n, Y^n](x^k, y^k)$ is defined as the frequency of appearances of the pair of $k$-strings $(x^k, y^k)$ along the pair $(X^n, Y^n)$. Our main interest is in the sample behavior of this (random) distribution. Letting $I(Q^k)$ denote the mutual information $I(X^k; Y^k)$ when $(X^k, Y^k) \sim Q^k$, we show that for any (sequence of) lossy source code(s) of rate $\le R$,
$$\limsup_{n \to \infty} \frac{1}{k} I\left(\hat{Q}^k[X^n, Y^n]\right) \le R + \frac{1}{k} H(X_1^k) - \bar{H}(X) \quad \text{a.s.},$$
where $\bar{H}(X)$ denotes the entropy rate of $X$. This is shown to imply, for a large class of sources including all independent and identically distributed (i.i.d.) sources and all sources satisfying the Shannon lower bound with equality, that for any sequence of codes that is good in the sense of asymptotically attaining a point on the rate-distortion curve,
$$\hat{Q}^k[X^n, Y^n] \stackrel{d}{\Rightarrow} P_{X^k, \tilde{Y}^k} \quad \text{a.s.}$$
whenever $P_{X^k, \tilde{Y}^k}$ is the unique distribution attaining the minimum in the definition of the $k$th-order rate-distortion function. Consequences of these results include a new proof of Kieffer's sample converse to lossy source coding, as well as performance bounds for compression-based denoisers.
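To make the central object concrete, the following is a minimal Python sketch of how $\hat{Q}^k[x^n, y^n]$ could be computed by counting aligned $k$-blocks, together with the mutual information $I(\hat{Q}^k)$ it induces. The function names (empirical_joint, mutual_information_bits) and the toy identity-code example are our own illustrations, not constructions from the paper.

    import numpy as np
    from collections import Counter

    def empirical_joint(x, y, k):
        # Q^k[x^n, y^n]: relative frequency of each aligned pair of
        # k-blocks (x^k, y^k) along the sequence pair (x, y).
        n = len(x)
        assert len(y) == n and k <= n
        blocks = Counter(
            (tuple(x[i:i + k]), tuple(y[i:i + k])) for i in range(n - k + 1)
        )
        total = n - k + 1
        return {pair: count / total for pair, count in blocks.items()}

    def mutual_information_bits(q):
        # I(X^k; Y^k) in bits when (X^k, Y^k) ~ q.
        px, py = Counter(), Counter()
        for (xk, yk), p in q.items():
            px[xk] += p
            py[yk] += p
        return sum(p * np.log2(p / (px[xk] * py[yk]))
                   for (xk, yk), p in q.items())

    # Illustration (our toy setup): an i.i.d. Bernoulli(1/2) source with
    # the identity "code" y = x, i.e., rate R = 1 bit/symbol and zero
    # distortion. Here H(X_1^k)/k equals the entropy rate, so the bound
    # reads (1/k) I(Q^k) <= 1, and the empirical value approaches 1.
    rng = np.random.default_rng(0)
    x = rng.integers(0, 2, size=100_000).tolist()
    k = 2
    q_hat = empirical_joint(x, x, k)
    print(mutual_information_bits(q_hat) / k)  # close to 1.0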

[1] Sanjeev Khudanpur et al., "Typicality of a Good Rate-Distortion Code," 2004.

[2] J. M. Bilbao et al., "Contributions to the Theory of Games," 2005.

[3] Robert M. Gray, "Information rates of autoregressive processes," IEEE Trans. Inf. Theory, 1970.

[4] David L. Donoho, "The Kolmogorov Sampler," 2002.

[5] John C. Kieffer, "Sample converses in source coding theory," IEEE Trans. Inf. Theory, 1991.

[6] J. Rissanen et al., "Normalized Maximum Likelihood Models for Boolean Regression with Application to Prediction and Classification in Genomics," 2003.

[7] Tsachy Weissman et al., "Universal discrete denoising," Proceedings of the IEEE Information Theory Workshop, 2002.

[8] Naftali Tishby et al., "The information bottleneck method," arXiv, 2000.

[9] Tsachy Weissman, "Not All Universal Source Codes Are Pointwise Universal," 2004.

[10] Jacob Ziv, "Coding of sources with unknown statistics-II: Distortion relative to a fidelity criterion," IEEE Trans. Inf. Theory, 1972.

[11] Jorma Rissanen, "MDL Denoising," IEEE Trans. Inf. Theory, 2000.

[12] Konstantinos Konstantinides et al., "Occam filters for stochastic sources with application to digital images," Proceedings of the 3rd IEEE International Conference on Image Processing, 1996.

[13] Neri Merhav et al., "Hidden Markov processes," IEEE Trans. Inf. Theory, 2002.

[14] Toby Berger, "Rate Distortion Theory: A Mathematical Basis for Data Compression," 1971.

[15] Thomas M. Cover et al., "Elements of Information Theory," 2005.

[16] Tsachy Weissman et al., "On competitive prediction and its relation to rate-distortion theory," IEEE Trans. Inf. Theory, 2003.

[17] R. Gray, "Rate distortion functions for finite-state finite-alphabet Markov sources," IEEE Trans. Inf. Theory, 1971.

[18] Martin Vetterli et al., "Bridging Compression to Wavelet Thresholding as a Denoising Method," 1997.

[19] Neri Merhav et al., "Universal Prediction," IEEE Trans. Inf. Theory, 1998.

[20] En-Hui Yang et al., "Simple universal lossy data compression schemes derived from the Lempel-Ziv algorithm," IEEE Trans. Inf. Theory, 1996.

[21] J. Kieffer, "An Almost Sure Convergence Theorem for Sequences of Random Variables Selected from Log-Convex Sets," 1991.

[22] David L. Neuhoff et al., "Simplistic Universal Coding," IEEE Trans. Inf. Theory, 1998.

[23] James Hannan, "Approximation to Bayes Risk in Repeated Play," 1958.

[24] Toby Berger et al., "Lossy Source Coding," IEEE Trans. Inf. Theory, 1998.

[25] R. Gallager, "Information Theory and Reliable Communication," 1968.

[26] John C. Kieffer, "A unified approach to weak universal source coding," IEEE Trans. Inf. Theory, 1978.

[27] A. Dembo et al., "The minimax distortion redundancy in noisy source coding," Proceedings of the IEEE International Symposium on Information Theory, 2002.

[28] Balas K. Natarajan, "Filtering random noise from deterministic signals via data compression," IEEE Trans. Signal Process., 1995.

[29] Balas K. Natarajan, "Filtering random noise via data compression," Proceedings of DCC '93: Data Compression Conference, 1993.

[30] Tsachy Weissman et al., "The empirical distribution of rate-constrained source codes," ISIT, 2004.

[31] S. Shamai et al., "The empirical distribution of good codes," Proceedings of the 1995 IEEE International Symposium on Information Theory, 1995.

[32] E. Samuel, "An Empirical Bayes Approach to the Testing of Certain Parametric Hypotheses," 1963.

[33] Ioannis Kontoyiannis, "Pointwise redundancy in lossy data compression and universal lossy data compression," IEEE Trans. Inf. Theory, 2000.

[34] Jacob Ziv, "Distortion-rate theory for individual sequences," IEEE Trans. Inf. Theory, 1980.
