MFRNet: A New CNN Architecture for Post-Processing and In-loop Filtering

In this paper, we propose a novel convolutional neural network (CNN) architecture, MFRNet, for post-processing (PP) and in-loop filtering (ILF) in the context of video compression. The network consists of four Multi-level Feature review Residual dense Blocks (MFRBs) connected in a cascading structure. Each MFRB extracts features from multiple convolutional layers using dense connections and a multi-level residual learning structure. To further improve information flow between blocks, each MFRB also reuses high-dimensional features from the previous MFRB. The network has been integrated into PP and ILF coding modules for both HEVC (HM 16.20) and VVC (VTM 7.0), and fully evaluated under the JVET Common Test Conditions using the Random Access configuration. Based on Bjøntegaard Delta measurements using both PSNR and VMAF for quality assessment, the experimental results show significant and consistent coding gains over both anchor codecs (HEVC HM and VVC VTM) and over other existing CNN-based PP/ILF approaches. When MFRNet is integrated into HM 16.20, gains of up to 16.0% (BD-rate VMAF) are demonstrated for ILF, and up to 21.0% (BD-rate VMAF) for PP. The respective gains for VTM 7.0 are up to 5.1% for ILF and up to 7.1% for PP.
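The coding gains above are reported as BD-rate figures. As a rough illustration of how such a figure can be obtained, the sketch below follows the original Bjøntegaard cubic-fit formulation: fit log10(bitrate) as a cubic function of quality for each codec, average the gap between the two curves over the shared quality range, and convert it to a percentage. It is not the evaluation code used for this work. The function names and the rate-quality points are hypothetical, the sketch assumes four rate points with distinct quality values per codec, and many current tools use piecewise cubic interpolation instead. The quality axis can be PSNR or VMAF.

```python
import math


def fit_polynomial(xs, ys):
    """Return monomial coefficients (constant term first) of the polynomial
    passing through all (x, y) points, via Newton divided differences."""
    n = len(xs)
    dd = list(ys)
    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            dd[i] = (dd[i] - dd[i - 1]) / (xs[i] - xs[i - j])
    # Expand the Newton form into monomial coefficients, Horner style.
    coeffs = [dd[-1]]
    for k in range(n - 2, -1, -1):
        shifted = [0.0] * (len(coeffs) + 1)
        for degree, c in enumerate(coeffs):
            shifted[degree + 1] += c        # c * x
            shifted[degree] -= c * xs[k]    # -c * xs[k]
        shifted[0] += dd[k]
        coeffs = shifted
    return coeffs


def integrate(coeffs, upper):
    """Integrate the polynomial from 0 to `upper`."""
    return sum(c * upper ** (d + 1) / (d + 1) for d, c in enumerate(coeffs))


def bd_rate(anchor, test):
    """Average bitrate difference (%) of `test` relative to `anchor`.

    Each argument is a list of four (bitrate, quality) pairs.
    A negative result means `test` needs less bitrate for equal quality.
    """
    # Only the quality range covered by both curves is compared.
    q_lo = max(min(q for _, q in anchor), min(q for _, q in test))
    q_hi = min(max(q for _, q in anchor), max(q for _, q in test))
    span = q_hi - q_lo

    def mean_log_rate(points):
        # Shift quality so the range starts at 0 (keeps the fit well conditioned).
        xs = [q - q_lo for _, q in points]
        ys = [math.log10(r) for r, _ in points]
        return integrate(fit_polynomial(xs, ys), span) / span

    diff = mean_log_rate(test) - mean_log_rate(anchor)
    return (10 ** diff - 1) * 100


# Hypothetical data: the test codec reaches each quality level with 10% fewer bits.
anchor = [(1000, 32.0), (2000, 35.0), (4000, 38.0), (8000, 41.0)]
test = [(0.9 * rate, quality) for rate, quality in anchor]
print(f"BD-rate: {bd_rate(anchor, test):.2f}%")
```

Under this sign convention a bitrate saving shows up as a negative BD-rate, so a reported gain of 16.0% would presumably correspond to a BD-rate of about -16.0%.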
