Spatial resolution adaptation framework for video compression

This paper presents a resolution adaptation framework for video compression. It dynamically applies spatial resampling, exploiting the trade-off between spatial resolution and quantization. A learning-based Quantization-Resolution Optimization (QRO) module, trained on a large database of video content, selects the optimal spatial resolution from multiple options, based on spatial and temporal features of the uncompressed video frames. To improve the quality of the upscaled video, a modified CNN-based single image super-resolution method is employed at the decoder. This super-resolution model was trained on compressed content from the same training database. The proposed framework was integrated with the High Efficiency Video Coding (HEVC) reference software, HM 16.18, and tested on UHD content from several databases, including videos from the JVET (Joint Video Exploration Team) test set. Experimental results show that, compared with the original HEVC HM 16.18, the proposed method offers significant overall bitrate savings across a wide range of bitrates, with average BD-rate savings of 12% (based on PSNR) and 15% (based on VMAF), at lower encoding complexity.
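The BD-rate figures quoted above are average bitrate differences between two rate-quality curves at equal quality (PSNR or VMAF). The sketch below illustrates the idea only and is not the evaluation code used in the paper. As a simplifying assumption, it interpolates log-bitrate linearly between measured points, whereas the standard Bjontegaard calculation fits a cubic polynomial. The function names and the example data points are hypothetical.

```python
import math


def _interp(x, xs, ys):
    """Linearly interpolate y at x; xs must be strictly increasing."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x lies outside the interpolation range")


def _mean_log_rate(quality, log_rate, lo, hi):
    """Average of the piecewise-linear log-rate curve over [lo, hi]."""
    # Integrate segment by segment; the trapezoid rule is exact here
    # because the knots include every breakpoint of the curve.
    knots = [lo] + [q for q in quality if lo < q < hi] + [hi]
    area = 0.0
    for a, b in zip(knots, knots[1:]):
        ya = _interp(a, quality, log_rate)
        yb = _interp(b, quality, log_rate)
        area += 0.5 * (ya + yb) * (b - a)
    return area / (hi - lo)


def bd_rate(anchor, test):
    """Approximate BD-rate (%) of `test` relative to `anchor`.

    Each argument is a list of (bitrate, quality) points with distinct
    quality values. A negative result means `test` needs less bitrate
    than `anchor` for the same quality.
    """
    def prepare(points):
        pts = sorted(points, key=lambda p: p[1])  # sort by quality
        return [q for _, q in pts], [math.log10(r) for r, _ in pts]

    qa, la = prepare(anchor)
    qt, lt = prepare(test)

    # Compare only over the quality range covered by both curves.
    lo, hi = max(qa[0], qt[0]), min(qa[-1], qt[-1])
    if hi <= lo:
        raise ValueError("the two curves do not overlap in quality")

    diff = _mean_log_rate(qt, lt, lo, hi) - _mean_log_rate(qa, la, lo, hi)
    return (10 ** diff - 1) * 100


# Hypothetical data: (bitrate in kbps, PSNR in dB) at four quantization levels.
anchor_points = [(1000.0, 32.0), (2000.0, 35.0), (4000.0, 38.0), (8000.0, 41.0)]
test_points = [(0.9 * rate, psnr) for rate, psnr in anchor_points]
print(round(bd_rate(anchor_points, test_points), 2))
```

In this made-up example every test point uses 10% less bitrate at the same PSNR, so the result is approximately -10, i.e. a 10% BD-rate saving. The same function applies unchanged when the quality axis is VMAF instead of PSNR.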
