Fundamental limits of universal variable-to-fixed length coding of parametric sources

Universal variable-to-fixed (V-F) length coding of a d-dimensional exponential family of distributions is considered. An achievable scheme is proposed, which uses a dictionary to parse the source output stream. The previously introduced notion of quantized type is employed for the dictionary construction. The quantized type class of a sequence is defined by partitioning the space of minimal sufficient statistics into cuboids. The proposed dictionary consists of the sequences on the boundary where the quantized type class size transitions from low to high. The asymptotics of the ε-coding rate of the proposed scheme are derived for sufficiently large dictionaries. In particular, we show that the third-order coding rate of the proposed scheme is H (d/2) (log log M)/(log M), where H is the entropy of the source and M is the dictionary size. We further provide a converse, showing that this rate is optimal up to the third-order term.
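
As a rough numerical illustration of the quantities named above, the following Python sketch evaluates the magnitude of the third-order term H (d/2) (log log M)/(log M) and maps a sufficient-statistic vector to the index of the cuboid that contains it. It is only a sketch under stated assumptions: the function names, the use of natural logarithms, and the fixed cuboid side length `side` are hypothetical choices made for illustration and are not taken from the paper.

```python
import math


def bernoulli_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) source, a d = 1 exponential family."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def third_order_term(entropy, d, dict_size):
    """Magnitude of H * (d/2) * log log M / log M (natural logs assumed)."""
    log_m = math.log(dict_size)
    return entropy * (d / 2.0) * math.log(log_m) / log_m


def quantized_type(stat, side):
    """Index of the cuboid containing the statistic vector `stat`.

    Assumption: the statistic space is cut into axis-aligned cuboids of a
    common side length `side`; the paper's actual side length is not used here.
    """
    return tuple(math.floor(s / side) for s in stat)


if __name__ == "__main__":
    h = bernoulli_entropy(0.3)
    for m in (2**10, 2**20, 2**40):
        print(m, third_order_term(h, d=1, dict_size=m))
    # Two statistics that fall in the same cuboid share a quantized type.
    print(quantized_type((0.26, 0.74), 0.25), quantized_type((0.30, 0.60), 0.25))
```

Under these assumptions, sequences whose sufficient statistics land in the same cuboid belong to the same quantized type class, and the printed term shrinks slowly as the dictionary size M grows.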
