Enhanced Machine Learning-based Inter Coding for VVC

In this paper, we propose an enhanced machine learning-based inter coding algorithm for VVC. Conceptually, the reference pictures from the decoded picture buffer are processed by a recurrent neural network to generate an artificial reference picture at the time instance of the currently coded picture. The network is trained with an SATD cost function that minimizes the bit rate cost of the prediction error rather than the pixel-wise difference. With this approach, we achieve average weighted BD-rate gains of 0.94%, while the coding time increases by about 5% for the encoder and 300% for the decoder due to the use of the neural network.
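The SATD objective mentioned above can be illustrated with a short sketch: the residual between the generated reference picture and the original picture is split into blocks, each block is Hadamard-transformed, and the absolute transformed coefficients are accumulated. The PyTorch function below is a hypothetical illustration of such a loss under these assumptions (function names, block size, and tensor layout are not taken from the paper).

```python
import torch

def hadamard_matrix(n: int) -> torch.Tensor:
    # Build an n x n Hadamard matrix via Sylvester's construction
    # (n is assumed to be a power of two).
    H = torch.ones(1, 1)
    while H.shape[0] < n:
        H = torch.cat([torch.cat([H, H], dim=1),
                       torch.cat([H, -H], dim=1)], dim=0)
    return H

def satd_loss(pred: torch.Tensor, orig: torch.Tensor, block: int = 8) -> torch.Tensor:
    # Sum of absolute transformed differences between the generated reference
    # picture `pred` and the original picture `orig`. Both tensors are
    # (N, C, H, W); H and W are assumed to be multiples of `block`.
    H = hadamard_matrix(block).to(dtype=pred.dtype, device=pred.device)
    residual = orig - pred
    # Split the residual into non-overlapping block x block patches.
    patches = residual.unfold(2, block, block).unfold(3, block, block)
    # 2-D Hadamard transform of each residual block: H * R * H^T.
    transformed = H @ patches @ H.t()
    return transformed.abs().mean()
```

In training, such a term could take the place of an L1/L2 pixel loss, so that the generated reference picture is optimized for the transformed residual cost rather than for plain pixel fidelity.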
