Universal Compression of Envelope Classes: Tight Characterization via Poisson Sampling

The Poisson-sampling technique eliminates dependencies among symbol appearances in a random sequence and has been used to simplify the analysis and strengthen the performance guarantees of randomized algorithms. Applying this method to universal compression, we relate the redundancies of fixed-length and Poisson-sampled sequences, and use this relation to derive a simple single-letter formula that approximates the redundancy of any envelope class to within an additive logarithmic term. As a first application, we treat i.i.d. distributions over a small alphabet as a step-envelope class and give a short proof that determines the redundancy of discrete distributions over a small alphabet up to first-order terms. We then demonstrate the strength of our method by applying the formula to tighten the existing bounds on the redundancy of exponential and power-law classes, in particular answering a question posed by Boucheron, Garivier, and Gassiat [6].
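The dependence-removal property behind Poisson sampling can be illustrated numerically: if the sequence length is drawn as N ~ Poisson(n) before sampling N i.i.d. symbols, each symbol's count becomes an independent Poisson(n·p(a)) variable, whereas at fixed length n the counts are negatively correlated. A minimal sketch (the distribution, length, and trial count below are illustrative choices, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

p = np.array([0.5, 0.3, 0.2])  # an i.i.d. source over a 3-symbol alphabet
n = 1000                       # nominal sequence length
trials = 20000

# Fixed-length sampling: counts of the symbols follow a multinomial law,
# so they are tied together by the constraint that they sum to n.
fixed_counts = rng.multinomial(n, p, size=trials)

# Poisson sampling: with N ~ Poisson(n), each symbol count is an
# independent Poisson(n * p_a) variable, so counts decouple.
poisson_counts = rng.poisson(n * p, size=(trials, len(p)))

# Empirical covariance between the first two symbol counts.
cov_fixed = np.cov(fixed_counts[:, 0], fixed_counts[:, 1])[0, 1]
cov_poisson = np.cov(poisson_counts[:, 0], poisson_counts[:, 1])[0, 1]

# Multinomial theory gives covariance -n*p0*p1 = -150 at fixed length,
# while under Poisson sampling the counts are independent (covariance ~ 0).
print(cov_fixed, cov_poisson)
```

The decoupling is what lets redundancy bounds for Poisson-sampled sequences be analyzed one symbol at a time and then transferred back to fixed-length sequences.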

[1] Max Buot, Probability and Computing: Randomized Algorithms and Probabilistic Analysis, 2006.

[2] Jorma Rissanen et al., Fisher information and stochastic complexity, 1996, IEEE Transactions on Information Theory.

[3] Alon Orlitsky et al., Speaking of infinity [i.i.d. strings], 2004, IEEE Transactions on Information Theory.

[4] Mesrob I. Ohannessian et al., About Adaptive Coding on Countable Alphabets: Max-Stable Envelope Classes, 2014, IEEE Transactions on Information Theory.

[5] Alon Orlitsky et al., Tight bounds for universal compression of large alphabets, 2013, IEEE International Symposium on Information Theory.

[6] Aurélien Garivier et al., Coding on Countably Infinite Alphabets, 2008, IEEE Transactions on Information Theory.

[7] W. Szpankowski, On asymptotics of certain recurrences arising in universal coding, 1998.

[8] Alon Orlitsky et al., Tight Bounds on Profile Redundancy and Distinguishability, 2012, NIPS.

[9] Thomas M. Cover et al., Elements of Information Theory, 2005.

[10] Andrew R. Barron et al., Asymptotic minimax regret for data compression, gambling, and prediction, 1997, IEEE Transactions on Information Theory.

[11] Dominique Bontemps, Universal Coding on Infinite Alphabets: Exponentially Decreasing Envelopes, 2011, IEEE Transactions on Information Theory.

[12] Gil I. Shamir, Universal Source Coding for Monotonic and Fast Decaying Monotonic Distributions, 2013, IEEE Transactions on Information Theory.

[13] Frans M. J. Willems et al., The context-tree weighting method: basic properties, 1995, IEEE Transactions on Information Theory.

[14] Eli Upfal et al., Probability and Computing: Randomized Algorithms and Probabilistic Analysis, 2005.

[15] A. Barron et al., Large alphabet coding and prediction through Poissonization and tilting, 2013.

[16] Michael B. Pursley et al., Efficient universal noiseless source codes, 1981, IEEE Transactions on Information Theory.

[17] Stéphane Boucheron et al., About Adaptive Coding on Countable Alphabets, 2012, IEEE Transactions on Information Theory.

[18] John C. Kieffer et al., A unified approach to weak universal source coding, 1978, IEEE Transactions on Information Theory.

[19] Lee D. Davisson et al., Universal noiseless coding, 1973, IEEE Transactions on Information Theory.

[20] Alon Orlitsky et al., Universal compression of memoryless sources over unknown alphabets, 2004, IEEE Transactions on Information Theory.

[21] Wojciech Szpankowski et al., Minimax redundancy for large alphabets, 2010, IEEE International Symposium on Information Theory.

[22] T. Cover, Universal Portfolios, 1996.

[23] Maryam Hosseini et al., On redundancy of memoryless sources over countable alphabets, 2014, International Symposium on Information Theory and its Applications.

[24] Dean P. Foster et al., Universal codes for finite sequences of integers drawn from a monotone distribution, 2002, IEEE Transactions on Information Theory.

[25] Alon Orlitsky et al., Efficient compression of monotone and m-modal distributions, 2014, IEEE International Symposium on Information Theory.