HEVC/H.265 coding unit split decision using deep reinforcement learning

The video coding community has long sought rate-distortion optimization techniques more effective than the widely adopted greedy approach. The difficulty lies in predicting how a coding mode decision made at one stage affects subsequent decisions and thus the overall coding performance. Taking a data-driven approach, we introduce in this paper deep reinforcement learning (RL) as a mechanism for the coding unit (CU) split decision in HEVC/H.265. We propose to regard the luminance samples of a CU together with the quantization parameter as its state, the split decision as an action, and the reduction in rate-distortion cost relative to keeping the current CU intact as the immediate reward. Based on the Q-learning algorithm, we train a convolutional neural network to approximate the rate-distortion cost reduction of each possible state-action pair. The proposed scheme performs comparably with the current full rate-distortion optimization scheme in HM-16.15, incurring a 2.5% average BD-rate loss. While also performing similarly to a conventional scheme that treats the split decision as a binary classification problem, our scheme can additionally quantify the rate-distortion cost reduction, enabling more applications.
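The state, action, and reward formulation above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the rate-distortion costs of every CU are already known, whereas the proposed scheme trains a convolutional neural network to estimate the cost reduction from luminance samples and the quantization parameter. The `CU` class, the function names, the undiscounted sum over the four sub-CUs, and the toy cost values are all hypothetical and chosen only to show why a decision based on the immediate reward alone can differ from one based on the long-term cost reduction.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CU:
    """Hypothetical CU node: RD cost when coded intact, plus optional sub-CUs."""
    cost_intact: float
    children: List["CU"] = field(default_factory=list)  # empty => cannot split


def immediate_reward(cu: CU) -> float:
    """RD cost reduction from splitting once, with every sub-CU kept intact."""
    return cu.cost_intact - sum(c.cost_intact for c in cu.children)


def q_split(cu: CU) -> float:
    """Long-term cost reduction of splitting (assumed undiscounted recursion).

    Immediate reward plus the best reduction each sub-CU can still achieve.
    """
    return immediate_reward(cu) + sum(best_reduction(c) for c in cu.children)


def best_reduction(cu: CU) -> float:
    """max over actions; keeping the CU intact has zero reduction by definition."""
    if not cu.children:
        return 0.0
    return max(0.0, q_split(cu))


def should_split(cu: CU) -> bool:
    """Split only when the long-term cost reduction is positive."""
    return bool(cu.children) and q_split(cu) > 0.0


# Toy quadtree (made-up costs): splitting the root looks bad immediately,
# but one sub-CU benefits strongly from a further split.
detailed = CU(30.0, [CU(1.0), CU(1.0), CU(1.0), CU(1.0)])
root = CU(100.0, [detailed, CU(30.0), CU(30.0), CU(30.0)])

print(immediate_reward(root))  # -20.0: a greedy rule would keep the root intact
print(q_split(root))           # 6.0: the long-term view favours splitting
```

In this toy tree the immediate reward of splitting the root is negative, yet the accumulated reduction is positive because of the gain available one level deeper. That is the kind of dependency between successive decisions that the Q-learning formulation is meant to capture.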
