Practical Lossless Compression with Latent Variables using Bits Back Coding

Deep latent variable models have seen recent success in many data domains. Lossless compression is an application of these models which, despite its potential to be highly useful, has yet to be implemented in a practical manner. We present "Bits Back with ANS" (BB-ANS), a scheme for performing lossless compression with latent variable models at a near-optimal rate. We demonstrate the scheme by using it to compress the MNIST dataset with a variational auto-encoder (VAE), achieving compression rates superior to standard methods with only a simple VAE. Since the scheme is highly amenable to parallelization, we conclude that, with a sufficiently high-quality generative model, it could be used to achieve substantial improvements in compression rate with acceptable running time. We make our implementation available open source at this https URL.
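
The abstract names the core mechanism of BB-ANS: bits-back coding implemented on top of the stack-like asymmetric numeral systems (ANS) coder. To encode a symbol x, one first decodes a latent z from the existing compressed state under the approximate posterior q(z|x) (the "bits back" step), then encodes x under p(x|z) and z under the prior p(z); the decoder inverts these steps in reverse and repays the borrowed bits, so the net cost per symbol is roughly -log p(x|z) - log p(z) + log q(z|x) bits, i.e. the negative evidence lower bound (ELBO). Below is a minimal sketch of this ordering over toy discrete alphabets. The non-streaming big-integer rANS coder and the hand-made frequency tables standing in for p(z), p(x|z) and q(z|x) are illustrative assumptions, not the paper's actual VAE-based implementation.

"""Minimal sketch of BB-ANS over toy discrete alphabets.

Assumptions (not the paper's code): a non-streaming rANS coder on a
Python big-int state, and fixed integer frequency tables standing in
for p(z), p(x|z) and q(z|x). The real implementation uses a VAE and a
streaming ANS coder.
"""

import random

PREC = 16                            # every frequency table sums to 2**PREC
M = 1 << PREC

def push(state, sym, freqs):
    """ANS encode: push `sym` onto the big-int `state`."""
    c = sum(freqs[:sym])             # cumulative frequency below sym
    f = freqs[sym]
    return (state // f) * M + (state % f) + c

def pop(state, freqs):
    """ANS decode: pop the most recently pushed symbol (LIFO)."""
    r = state % M
    c, sym = 0, 0
    while c + freqs[sym] <= r:       # locate the bucket containing r
        c += freqs[sym]
        sym += 1
    return freqs[sym] * (state // M) + r - c, sym

# Toy model: 2 latent values, 4 observed values (each row sums to M).
p_z   = [M // 2, M // 2]                        # prior p(z)
p_x_z = [[M // 2, M // 4, M // 8, M // 8],      # likelihood p(x|z)
         [M // 8, M // 8, M // 4, M // 2]]
q_z_x = ([[3 * M // 4, M // 4]] * 2 +           # posterior q(z|x)
         [[M // 4, 3 * M // 4]] * 2)

def bbans_encode(state, x):
    state, z = pop(state, q_z_x[x])  # "get bits back": decode z from the stack
    state = push(state, x, p_x_z[z]) # encode x under p(x|z)
    state = push(state, z, p_z)      # encode z under the prior
    return state

def bbans_decode(state):
    state, z = pop(state, p_z)       # invert the encoder, in reverse order
    state, x = pop(state, p_x_z[z])
    state = push(state, z, q_z_x[x]) # repay the borrowed bits
    return state, x

if __name__ == "__main__":
    data = [0, 3, 2, 1, 3]
    state = random.getrandbits(64)   # initial bits for the first pop
    for x in data:
        state = bbans_encode(state, x)
    decoded = []
    for _ in data:
        state, x = bbans_decode(state)
        decoded.append(x)
    assert decoded == data[::-1]     # LIFO: symbols come back in reverse
    print("round trip OK")

Note the LIFO discipline inherited from ANS: symbols decode in reverse order, and after all symbols are decoded the state returns exactly to its initial value, confirming that the bits "borrowed" by the posterior pops have been fully repaid.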
