Improving Inference for Neural Image Compression

We consider the problem of lossy image compression with deep latent variable models. State-of-the-art methods build on hierarchical variational autoencoders (VAEs) and learn inference networks to predict a compressible latent representation of each data point. Drawing on the variational inference perspective on compression, we identify three approximation gaps that limit performance in the conventional approach: (i) an amortization gap, (ii) a discretization gap, and (iii) a marginalization gap. We propose remedies for each of these three shortcomings based on ideas related to iterative inference, stochastic annealing for discrete optimization, and bits-back coding, resulting in the first application of bits-back coding to lossy compression. In our experiments, which include extensive baseline comparisons and ablation studies, we achieve new state-of-the-art performance on lossy image compression using an established VAE architecture, by changing only the inference method.
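To make the amortization gap concrete, here is a minimal toy sketch. It is not the paper's model or method. It assumes a hypothetical linear decoder `W`, a deliberately crude stand-in encoder `crude_encoder`, and a squared-norm penalty as a placeholder for the rate term (weight `LAM`). The encoder produces one cheap guess for the latent, and plain gradient descent then refines that guess for the single data point at hand, which is the basic idea behind iterative inference.

```python
# Toy illustration only: a linear "decoder" and a deliberately crude "encoder".
# W, LAM, crude_encoder, and the squared-norm rate proxy are all hypothetical.

W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]   # decoder: maps a 2-d latent z to a 3-d reconstruction
LAM = 0.1          # weight of the rate proxy


def decode(z):
    """Linear decoder: reconstruction = W z."""
    return [sum(w * zj for w, zj in zip(row, z)) for row in W]


def objective(z, x):
    """Distortion plus a simple rate proxy, evaluated for one data point x."""
    distortion = sum((xi - ri) ** 2 for xi, ri in zip(x, decode(z)))
    rate_proxy = LAM * sum(zj ** 2 for zj in z)
    return distortion + rate_proxy


def gradient(z, x):
    """Gradient of objective(z, x) with respect to z."""
    residual = [xi - ri for xi, ri in zip(x, decode(z))]
    return [
        -2.0 * sum(W[i][j] * residual[i] for i in range(len(W))) + 2.0 * LAM * z[j]
        for j in range(len(z))
    ]


def crude_encoder(x):
    """Stand-in for an amortized inference network: one cheap guess per input."""
    return [0.5 * x[0], 0.5 * x[1]]


def refine(z0, x, step=0.1, iters=200):
    """Per-data-point refinement: gradient descent on z, starting from z0."""
    z = list(z0)
    for _ in range(iters):
        g = gradient(z, x)
        z = [zj - step * gj for zj, gj in zip(z, g)]
    return z


x = [1.0, 2.0, 2.5]
z_init = crude_encoder(x)
z_refined = refine(z_init, x)

initial_loss = objective(z_init, x)
refined_loss = objective(z_refined, x)
print(f"encoder guess: {initial_loss:.4f}  after refinement: {refined_loss:.4f}")
```

In this toy, the difference between the two printed values plays the role of the amortization gap: the encoder's one-shot guess is suboptimal for this particular input, and per-input optimization recovers the difference. The sketch keeps the latent continuous, so it says nothing about the discretization or marginalization gaps.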

Stephan Mandt | Yibo Yang | Robert Bamler
