ViSTRA2: Video Coding using Spatial Resolution and Effective Bit Depth Adaptation

We present a new video compression framework (ViSTRA2) which exploits adaptation of spatial resolution and effective bit depth, down-sampling these parameters at the encoder based on perceptual criteria, and up-sampling at the decoder using a deep convolutional neural network. ViSTRA2 has been integrated with the reference software of both HEVC (HM 16.20) and VVC (VTM 4.01), and evaluated under the Joint Video Exploration Team Common Test Conditions using the Random Access configuration. Our results show consistent and significant compression gains over both HM and VTM based on Bjøntegaard Delta measurements, with average BD-rate savings of 12.6% (PSNR) and 19.5% (VMAF) over HM and 5.5% (PSNR) and 8.6% (VMAF) over VTM.
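
The BD-rate figures quoted above summarise the average bitrate difference between two rate-quality curves at equal quality. A minimal Python sketch of the commonly used Bjøntegaard calculation is given below, assuming a cubic polynomial fit of log-rate against quality. The function name `bd_rate`, its argument names, and the sample rate points are illustrative assumptions, not taken from the paper or from the HM/VTM software.

```python
import numpy as np


def bd_rate(anchor_rates, anchor_quality, test_rates, test_quality):
    """Approximate Bjontegaard Delta rate (in percent) of test vs. anchor.

    Hypothetical helper: fits a cubic polynomial of log-rate as a function
    of quality (PSNR or VMAF) for each codec, then averages the gap between
    the two fits over the quality range the curves share.
    Negative values mean the test codec needs less bitrate.
    """
    log_a = np.log(np.asarray(anchor_rates, dtype=float))
    log_t = np.log(np.asarray(test_rates, dtype=float))
    q_a = np.asarray(anchor_quality, dtype=float)
    q_t = np.asarray(test_quality, dtype=float)

    # Cubic fit: log(rate) = f(quality), one polynomial per codec.
    poly_a = np.polyfit(q_a, log_a, 3)
    poly_t = np.polyfit(q_t, log_t, 3)

    # Integrate only over the overlapping quality interval.
    lo = max(q_a.min(), q_t.min())
    hi = min(q_a.max(), q_t.max())
    int_a = np.polyval(np.polyint(poly_a), hi) - np.polyval(np.polyint(poly_a), lo)
    int_t = np.polyval(np.polyint(poly_t), hi) - np.polyval(np.polyint(poly_t), lo)

    # Average log-rate difference, converted back to a percentage.
    avg_log_diff = (int_t - int_a) / (hi - lo)
    return (np.exp(avg_log_diff) - 1.0) * 100.0


# Illustrative (made-up) rate points in kbps with PSNR in dB.
anchor_rates = [1000.0, 2000.0, 4000.0, 8000.0]
test_rates = [0.9 * r for r in anchor_rates]
psnr = [32.0, 35.0, 38.0, 41.0]
result = bd_rate(anchor_rates, psnr, test_rates, psnr)
print(f"BD-rate: {result:.2f}%")
```

In this made-up example the test codec uses 10% less bitrate at every quality point, so the sketch returns approximately -10%. Passing VMAF scores instead of PSNR values as the quality arguments gives the VMAF-based variant.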
