Paper Information - Improving Deep Video Compression by Resolution-adaptive Flow Coding


In learning-based video compression approaches, compressing pixel-level optical flow maps with newly developed motion vector (MV) encoders is an essential issue. In this work, we propose a new framework called Resolution-adaptive Flow Coding (RaFC) to effectively compress the flow maps globally and locally, in which we use multi-resolution representations instead of single-resolution representations for both the input flow maps and the output motion features of the MV encoder. To handle complex or simple motion patterns globally, our frame-level scheme RaFC-frame automatically decides the optimal flow map resolution for each video frame. To cope with different types of motion patterns locally, our block-level scheme called RaFC-block can also select the optimal resolution for each local block of motion features. In addition, the rate-distortion criterion is applied to both RaFC-frame and RaFC-block to select the optimal motion coding mode for effective flow coding. Comprehensive experiments on four benchmark datasets, HEVC, VTL, UVG and MCL-JCV, clearly demonstrate the effectiveness of our overall RaFC framework after combining RaFC-frame and RaFC-block for video compression.
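The rate-distortion selection described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact cost function, so the Lagrangian form `rate + lam * distortion` used here is an assumption, and the names `CodingMode`, `rd_cost`, and `select_mode`, as well as the numeric values, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CodingMode:
    """One candidate way of coding the motion of a frame or block (hypothetical)."""
    name: str          # e.g. "full-res" or "half-res" flow representation
    rate: float        # estimated bits per pixel spent on this mode
    distortion: float  # reconstruction error (e.g. MSE) obtained with this mode


def rd_cost(mode: CodingMode, lam: float) -> float:
    """Assumed Lagrangian rate-distortion cost: rate + lam * distortion."""
    return mode.rate + lam * mode.distortion


def select_mode(candidates: Sequence[CodingMode], lam: float) -> CodingMode:
    """Return the candidate with the lowest rate-distortion cost."""
    if not candidates:
        raise ValueError("at least one candidate mode is required")
    return min(candidates, key=lambda m: rd_cost(m, lam))


# Illustrative numbers only: the half-resolution mode costs fewer bits
# but reconstructs the frame less accurately.
modes = [
    CodingMode("full-res", rate=0.10, distortion=0.0020),
    CodingMode("half-res", rate=0.04, distortion=0.0030),
]

high_quality_choice = select_mode(modes, lam=256.0)  # distortion weighted heavily
low_bitrate_choice = select_mode(modes, lam=16.0)    # rate dominates the cost
```

With these made-up values, a large `lam` favors the full-resolution mode and a small `lam` favors the half-resolution mode. The same comparison could be applied per frame (as in RaFC-frame) or per block (as in RaFC-block), although how the paper measures rate and distortion is not stated in the abstract.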
