Self-Organized Variational Autoencoders (Self-VAE) for Learned Image Compression

In end-to-end optimized learned image compression, it is standard practice to use a convolutional variational autoencoder with generalized divisive normalization (GDN) to transform images into a latent space. Recently, Operational Neural Networks (ONNs), which learn the best non-linearity from a set of alternatives, and their “self-organized” variants, Self-ONNs, which approximate any non-linearity via a Taylor series, have been proposed to address the limitations of convolutional layers with a fixed nonlinear activation. In this paper, we replace the convolutional and GDN layers in the variational autoencoder with self-organized operational layers, and propose a novel self-organized variational autoencoder (Self-VAE) architecture that benefits from stronger non-linearity. Experimental results demonstrate that the proposed Self-VAE improves both rate-distortion performance and perceptual image quality.
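To make the idea of a self-organized operational layer concrete, the following is a minimal single-channel sketch in plain Python, not the authors' implementation. It assumes the common Self-ONN formulation in which the output is a sum of Q ordinary convolutions, each applied to the element-wise q-th power of the input, so that the layer behaves like a truncated Taylor (Maclaurin) series with learned coefficients. The names `conv2d_valid` and `self_onn2d` are hypothetical, and details such as bounding the input range, multiple channels, and strides are omitted.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(x: Matrix, k: Matrix) -> Matrix:
    """Plain 2-D cross-correlation with 'valid' padding and stride 1."""
    kh, kw = len(k), len(k[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [
        [
            sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def self_onn2d(x: Matrix, kernels: List[Matrix], bias: float = 0.0) -> Matrix:
    """Assumed Self-ONN form: y = bias + sum_{q=1..Q} conv(x**q, kernels[q-1]).

    kernels[q-1] holds the learned coefficients of the q-th power term;
    all kernels are assumed to share the same spatial size.
    """
    out = None
    for q, k in enumerate(kernels, start=1):
        # Raise every input sample to the q-th power, then convolve as usual.
        x_pow = [[v ** q for v in row] for row in x]
        term = conv2d_valid(x_pow, k)
        if out is None:
            out = [[bias + t for t in row] for row in term]
        else:
            out = [[o + t for o, t in zip(ro, rt)] for ro, rt in zip(out, term)]
    return out


if __name__ == "__main__":
    x = [[1.0, 2.0], [3.0, 4.0]]
    # Q = 2 with 1x1 kernels: each output is x + 0.5 * x**2.
    print(self_onn2d(x, [[[1.0]], [[0.5]]]))
```

With a single kernel (Q = 1) the sketch reduces to an ordinary convolution plus bias, which is why such a layer can be viewed as a generalization of the convolutional layer it replaces.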
