Ensemble Learning-Based Rate-Distortion Optimization for End-to-End Image Compression

End-to-end image compression, which uses trained deep networks as the encoding/decoding models, has developed substantially in recent years. Previous work is limited to a single encoding/decoding model, whereas we explore the use of multiple encoding/decoding models as an ensemble. We propose several methods to obtain multiple models. First, we adopt a boosting strategy to train multiple diverse networks as an ensemble. Second, we train an ensemble of multiple probability distribution models to narrow the distribution gap for efficient entropy coding. Third, we present a geometric transform-based self-ensemble method. The multiple models can be regarded as multiple coding modes, similar to those in non-deep video coding schemes. We further adopt block-level model/mode selection at the encoder side to pursue rate-distortion optimization, using hierarchical block partitioning to improve adaptability. Compared with single-model end-to-end compression, our method improves compression efficiency significantly, achieving a 21% BD-rate reduction on the Kodak dataset without increasing decoding complexity. Alternatively, at the same compression efficiency, our method can use much simpler decoding models, reducing floating-point operations by 70%.
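The encoder-side mode decision can be made concrete with a small sketch. Below is a minimal, self-contained numpy illustration, not the paper's implementation: toy uniform quantizers stand in for the trained encoder/decoder networks, `empirical_rate` stands in for a learned entropy model, and `partition_and_select` performs a quadtree-style block partitioning with per-block model selection under the Lagrangian cost J = D + λR. All function names and the signaling-overhead accounting are illustrative assumptions.

```python
import numpy as np

def empirical_rate(q):
    """Crude rate estimate: empirical entropy of the quantized symbols,
    in total bits for the block (a stand-in for a learned entropy model)."""
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum() * q.size)

def make_quantizer(step):
    """Toy stand-in 'model': a uniform quantizer with a given step size.
    A real learned codec would wrap a trained encoder/decoder network."""
    def encode(block):
        q = np.round(block / step)
        return empirical_rate(q), q
    def decode(q):
        return q * step
    return encode, decode

def select_model(block, models, lmbda):
    """Encoder-side mode decision: try every candidate model on the block
    and keep the one minimizing the Lagrangian cost J = D + lambda * R.
    Only the chosen model index and its bitstream need to be signaled."""
    best = None
    for idx, (encode, decode) in enumerate(models):
        rate, latents = encode(block)
        recon = decode(latents)
        dist = float(np.mean((block - recon) ** 2))  # MSE distortion
        cost = dist + lmbda * rate
        if best is None or cost < best[0]:
            best = (cost, idx, latents)
    return best  # (cost, model_index, latents)

def partition_and_select(block, models, lmbda, min_size=32):
    """Quadtree-style hierarchical partitioning: code the block whole, or
    split it into four sub-blocks and recurse, whichever costs less. The
    per-node signaling overhead (split flag, model index) is folded into
    the rate term; the exact accounting here is a hypothetical choice."""
    whole_cost, idx, latents = select_model(block, models, lmbda)
    whole_cost += lmbda * (1 + np.log2(len(models)))  # signaling bits
    h, w = block.shape
    if h <= min_size or w <= min_size:
        return whole_cost, ("leaf", idx, latents)
    split_cost, children = lmbda * 1.0, []  # 1 bit for the split flag
    for i in (0, h // 2):
        for j in (0, w // 2):
            c, node = partition_and_select(
                block[i:i + h // 2, j:j + w // 2], models, lmbda, min_size)
            split_cost += c
            children.append(node)
    if split_cost < whole_cost:
        return split_cost, ("split", children)
    return whole_cost, ("leaf", idx, latents)

# Usage on a random "image": three quantizers of different coarseness
# act as the candidate coding modes.
models = [make_quantizer(s) for s in (4.0, 8.0, 16.0)]
img = np.random.rand(64, 64) * 255.0
cost, tree = partition_and_select(img, models, lmbda=0.05)
print(f"total RD cost: {cost:.1f}, root decision: {tree[0]}")
```

In a real learned codec the distortion and rate would come from network reconstructions and an entropy model rather than these toy quantizers; the recursive whole-versus-split cost comparison mirrors HEVC-style block partitioning decisions.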
