Rate-Distortion Theory for General Sets and Measures

This paper is concerned with a rate-distortion theory for sequences of i.i.d. random variables with general distribution supported on general sets, including manifolds and fractal sets. Manifold structures are prevalent in data science, e.g., in compressed sensing, machine learning, image processing, and handwritten digit recognition. Fractal sets find application in image compression and in the modeling of Ethernet traffic. We derive a lower bound on the (single-letter) rate-distortion function that applies to random variables $X$ of general distribution $\mu_{X}$ and that, for continuous $X$, reduces to the classical Shannon lower bound. Moreover, our lower bound is explicit up to a parameter obtained by solving a convex optimization problem in a nonnegative real variable. The only requirement for the bound to apply is the existence of a $\sigma$-finite reference measure $\mu$ for $X$ (i.e., a measure $\mu$ with $\mu_{X}\ll\mu$ and such that the generalized entropy $h_{\mu}(X)$ is finite) satisfying a certain subregularity condition. This condition is very general and prevents the reference measure $\mu$ from being highly concentrated on balls of small radii. To illustrate the wide applicability of our result, we evaluate the lower bound for a random variable distributed uniformly on a manifold, namely, the unit circle, and for a random variable distributed uniformly on a self-similar set, namely, the middle-third Cantor set.
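
For orientation, the classical Shannon lower bound to which the result reduces can be written out in one standard setting. The setting is an assumption made here for illustration, since the abstract does not fix the distortion measure: $X$ takes values in $\mathbb{R}^{d}$, has a density with finite differential entropy $h(X)$, and the distortion constraint is $\mathbb{E}\|X-\hat{X}\|^{2}\le D$. Under these assumptions the bound reads

$$
R(D)\;\ge\;h(X)-\frac{d}{2}\log\!\left(\frac{2\pi e D}{d}\right).
$$

In the general setting described above, the differential entropy would be replaced by the generalized entropy with respect to the reference measure, which under the usual definition (again an assumption, not stated in the abstract) is

$$
h_{\mu}(X)\;=\;-\,\mathbb{E}\!\left[\log\frac{\mathrm{d}\mu_{X}}{\mathrm{d}\mu}(X)\right],
$$

so that taking $\mu$ to be Lebesgue measure recovers $h(X)$.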
