A Deep Learning Based Perceptual Bit Allocation Scheme on Conversational Videos for HEVC \(\lambda\)-Domain Rate Control

The latest \(\lambda\)-domain rate control in High Efficiency Video Coding (HEVC) adaptively allocates bits per pixel (bpp) without considering visual importance. This paper proposes a perceptual bit allocation scheme based on deep learning for conversational videos. First, a multitask cascaded convolutional network is employed to detect facial regions in the videos being encoded. Then, instead of a predefined bit ratio for frame-level bit allocation, an adaptive bit-rate ratio, determined mainly by the variation between frames, is used to allocate bits more reasonably to each frame. Finally, the quantization parameters (QPs) of the facial regions are clipped to a smaller interval to obtain better visual quality. The experimental results show that the quality of the facial regions is improved significantly while good rate control performance is maintained.
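The QP clipping step can be illustrated with a minimal sketch. The abstract does not specify the actual intervals, so the function name `clip_qp` and the half-widths `face_delta` and `default_delta` are hypothetical choices made only for illustration; the only fixed fact used is the HEVC QP range of 0 to 51 for 8-bit video. The sketch is one possible reading of "a smaller interval": blocks flagged as facial are kept closer to the frame-level QP than other blocks.

```python
def clip_qp(est_qp, frame_qp, is_face,
            face_delta=1, default_delta=2, qp_min=0, qp_max=51):
    """Clip an estimated block-level QP to an interval around the frame QP.

    face_delta and default_delta are assumed half-widths, not values
    taken from the paper; facial blocks get the narrower interval.
    """
    delta = face_delta if is_face else default_delta
    # Keep the interval inside the valid HEVC QP range for 8-bit video.
    low = max(qp_min, frame_qp - delta)
    high = min(qp_max, frame_qp + delta)
    return max(low, min(high, est_qp))


if __name__ == "__main__":
    # Same estimated QP, clipped differently for facial and non-facial blocks.
    print(clip_qp(35, 30, is_face=True))
    print(clip_qp(35, 30, is_face=False))
```

With a narrower interval, the QP of a facial block cannot rise as far above the frame-level QP as that of a background block, which limits quality loss in the face when the rate controller tightens the budget.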