Local Decoding and Update of Compressed Data

When compressing large datasets, it is often desirable to guarantee locality properties that allow short fragments of the data to be decoded and updated efficiently. This paper proposes a universal compression scheme for memoryless sources with the following features: (1) the rate can be made arbitrarily close to the entropy of the underlying source; (2) source fragments of constant size (that is, of size independent of the blocklength) can be recovered by probing a constant number of codeword bits on average; (3) such fragments can be updated by reading and modifying a constant number of codeword symbols on average; and (4) the overall encoding and decoding complexity is quasilinear in the blocklength of the source.
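To make the locality properties concrete, the following is a minimal sketch of a naive baseline, not the scheme proposed in the paper. It assumes the data is split into fixed-size blocks that are compressed independently with `zlib`; the class name `BlockStore`, its methods, and the default `block_size` are hypothetical and chosen only for illustration. Reading or updating one symbol touches a single block, so the cost does not grow with the total length, but small blocks generally compress poorly, which is the tension between rate and locality that the abstract refers to.

```python
import zlib


class BlockStore:
    """Naive baseline (hypothetical): independently compressed fixed-size blocks."""

    def __init__(self, data: bytes, block_size: int = 64):
        self.block_size = block_size
        # Each block is compressed on its own, so it can be decoded on its own.
        self.blocks = [
            zlib.compress(data[i:i + block_size])
            for i in range(0, len(data), block_size)
        ]

    def read(self, pos: int) -> int:
        """Local decoding: decompress only the block that contains `pos`."""
        block = zlib.decompress(self.blocks[pos // self.block_size])
        return block[pos % self.block_size]

    def write(self, pos: int, value: int) -> None:
        """Local update: decompress, modify, and recompress a single block."""
        idx = pos // self.block_size
        block = bytearray(zlib.decompress(self.blocks[idx]))
        block[pos % self.block_size] = value
        self.blocks[idx] = zlib.compress(bytes(block))

    def compressed_size(self) -> int:
        """Total number of compressed bytes across all blocks."""
        return sum(len(b) for b in self.blocks)
```

For example, `store = BlockStore(b"ab" * 500)` followed by `store.write(3, ord("z"))` changes only the first compressed block. Comparing `store.compressed_size()` across different `block_size` values gives a rough feel for how much rate this baseline gives up in exchange for locality.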
