Interpreting CNN For Low Complexity Learned Sub-Pixel Motion Compensation In Video Coding

Deep learning has shown great potential in image and video compression tasks. However, its bit savings come at the cost of a significant increase in coding complexity, which limits its use in practical applications. This paper presents a novel neural network-based tool that improves the interpolation of reference samples needed for fractional-precision motion compensation. In contrast to previous efforts, the proposed approach focuses on reducing complexity by interpreting the interpolation filters learned by the networks. When the approach is implemented in the Versatile Video Coding (VVC) test model, BD-rate savings of up to 4.5% are achieved for individual sequences relative to the baseline VVC, while the complexity of the learned interpolation is significantly lower than that of applying the full neural network.
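One way to picture how interpreting a learned network can reduce complexity is sketched below. It assumes, purely for illustration, that the network is a stack of linear 1-D convolution layers with no biases and no activations; the abstract does not state this, and the function names `apply_layers` and `collapse_layers` are hypothetical. Under that assumption the whole stack is equivalent to a single filter, which can be applied in one pass instead of running every layer.

```python
import numpy as np

def apply_layers(signal, kernels):
    """Run the signal through each linear convolution layer in turn.

    Assumption: layers have no bias and no activation function.
    """
    out = signal
    for k in kernels:
        out = np.convolve(out, k, mode="full")
    return out

def collapse_layers(kernels):
    """Fold a stack of linear convolution layers into one equivalent kernel.

    Convolution is associative, so convolving the kernels together
    gives a single filter with the same overall response.
    """
    combined = np.array([1.0])
    for k in kernels:
        combined = np.convolve(combined, k, mode="full")
    return combined

# Random stand-ins for learned weights (not taken from the paper).
rng = np.random.default_rng(0)
kernels = [rng.standard_normal(3), rng.standard_normal(5), rng.standard_normal(3)]
signal = rng.standard_normal(32)

layered = apply_layers(signal, kernels)          # three passes over the data
single_kernel = collapse_layers(kernels)         # one 9-tap filter
single = np.convolve(signal, single_kernel, mode="full")  # one pass
```

In this sketch the three layers with 3, 5 and 3 taps collapse into one 9-tap filter, so the per-sample cost at inference time no longer depends on the depth of the network. Any nonlinearity between layers would break this equivalence.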
