Improving Inference for Neural Image Compression

We consider the problem of lossy image compression with deep latent variable models. State-of-the-art methods build on hierarchical variational autoencoders (VAEs) and learn inference networks to predict a compressible latent representation of each data point. Drawing on the variational inference perspective on compression, we identify three approximation gaps that limit performance in the conventional approach: (i) an amortization gap, (ii) a discretization gap, and (iii) a marginalization gap. We propose remedies for each of these three shortcomings based on ideas related to iterative inference, stochastic annealing for discrete optimization, and bits-back coding, resulting in the first application of bits-back coding to lossy compression. In our experiments, which include extensive baseline comparisons and ablation studies, we achieve new state-of-the-art performance on lossy image compression using an established VAE architecture, by changing only the inference method.
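
To make the amortization gap concrete, the following is a minimal sketch of iterative inference on a toy problem, not the method or architecture of the paper. It assumes a linear decoder `W`, a standard Gaussian prior whose negative log-density serves as a rate proxy, and a deliberately imperfect linear "encoder" `E`. All names (`rd_loss`, `refine_latent`, `E`, `lam`) are hypothetical. The sketch keeps the latents continuous, so it says nothing about the discretization or marginalization gaps.

```python
import numpy as np

def rd_loss(z, x, W, lam):
    """Toy rate-distortion objective: squared error plus a Gaussian-prior rate proxy."""
    distortion = np.sum((x - W @ z) ** 2)
    rate_proxy = 0.5 * np.sum(z ** 2)  # -log N(z; 0, I) up to an additive constant
    return distortion + lam * rate_proxy

def rd_grad(z, x, W, lam):
    """Gradient of rd_loss with respect to the latent z."""
    return -2.0 * W.T @ (x - W @ z) + lam * z

def refine_latent(z_init, x, W, lam, n_steps=500):
    """Refine an encoder prediction by gradient descent on the per-image objective."""
    # Step size 1/L, where L is the Lipschitz constant of the gradient;
    # for this convex quadratic it guarantees the loss never increases.
    step = 1.0 / (2.0 * np.linalg.norm(W, 2) ** 2 + lam)
    z = z_init.copy()
    for _ in range(n_steps):
        z = z - step * rd_grad(z, x, W, lam)
    return z

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 4))   # assumed toy decoder: x is approximated by W @ z
E = 0.05 * W.T                 # assumed stand-in for an imperfect amortized encoder
x = rng.normal(size=16)        # one "image" to compress
lam = 0.1                      # rate-distortion trade-off weight

z_amortized = E @ x                                # one-shot encoder prediction
z_refined = refine_latent(z_amortized, x, W, lam)  # per-image iterative inference

print("loss, amortized:", rd_loss(z_amortized, x, W, lam))
print("loss, refined:  ", rd_loss(z_refined, x, W, lam))
```

In this toy setting the difference between the two printed losses plays the role of the amortization gap: the encoder output is only a starting point, and optimizing the latents for the individual data point lowers the objective without changing the decoder.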
