Paper information - A Combined Deep Learning based End-to-End Video Coding Architecture for YUV Color Space

Most existing deep learning based end-to-end video coding (DLEC) architectures are designed specifically for the RGB color format, yet the video coding standards developed over the past few decades, including H.264/AVC, H.265/HEVC and H.266/VVC, have been designed primarily for the YUV 4:2:0 format, where the chrominance (U and V) components are subsampled to achieve superior compression performance by taking the human visual system into account. While many papers on DLEC compare these two distinct coding schemes in the RGB domain, a common evaluation framework in the YUV 4:2:0 domain would allow a fairer comparison. This paper introduces a new DLEC architecture for video coding that effectively supports YUV 4:2:0 and compares its performance against the HEVC standard under a common evaluation framework. The experimental results on YUV 4:2:0 video sequences show that the proposed architecture can outperform HEVC in intra-frame coding; however, inter-frame coding is not as efficient, contrary to the RGB coding results reported in recent papers.
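To make the chroma subsampling mentioned in the abstract concrete, the following is a minimal sketch of 4:2:0 subsampling of a single chrominance plane. It is not taken from the paper: the function names `subsample_420` and `samples_per_frame` are hypothetical, the 2x2 box average is an assumed downsampling filter (the abstract does not say which filter is used), and chroma sample siting is ignored.

```python
def subsample_420(plane):
    """Halve a chroma plane in both dimensions by averaging each 2x2 block.

    `plane` is a list of equal-length rows of numbers. The 2x2 box average
    is an assumption made for illustration, not the paper's filter.
    """
    height, width = len(plane), len(plane[0])
    if height % 2 or width % 2:
        raise ValueError("plane dimensions must be even for this sketch")
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def samples_per_frame(width, height):
    """Return (sample count at full chroma resolution, sample count in 4:2:0)."""
    full = 3 * width * height                                # Y, U, V at full size
    sub = width * height + 2 * (width // 2) * (height // 2)  # U, V at quarter size
    return full, sub


if __name__ == "__main__":
    u_plane = [
        [10, 12, 20, 22],
        [14, 16, 24, 26],
    ]
    print(subsample_420(u_plane))
    print(samples_per_frame(1920, 1080))
```

Because each chroma plane keeps one sample per 2x2 block, a 4:2:0 frame carries half as many samples as a frame with full-resolution chroma, which is one reason results reported in the RGB domain are not directly comparable with codecs tuned for YUV 4:2:0.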
