Chroma Intra Prediction With Attention-Based CNN Architectures

Neural networks can be used in video coding to improve chroma intra-prediction. In particular, the use of fully connected networks has enabled better cross-component prediction than traditional linear models. Nonetheless, state-of-the-art architectures tend to disregard the location of individual reference samples in the prediction process. This paper proposes a new neural network architecture for cross-component intra-prediction. The network uses a novel attention module to model spatial relations between reference and predicted samples. The proposed approach is integrated into the Versatile Video Coding (VVC) prediction pipeline. Experimental results demonstrate compression gains over the latest VVC anchor, as well as improvements over state-of-the-art chroma intra-prediction methods based on neural networks.
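To make the idea of attending from predicted samples to reference samples more concrete, the following is a minimal NumPy sketch, not the architecture proposed in the paper. It assumes that each position in the block forms a query from its collocated luma value, that each boundary reference sample forms a key from its luma value, and that the predicted chroma is the attention-weighted average of the reference chroma values. The function name, the projection matrices `w_q` and `w_k`, and all shapes are hypothetical and chosen only for illustration; in a trained network the projections would be learned.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_chroma_prediction(block_luma, ref_luma, ref_chroma, w_q, w_k):
    """Illustrative cross-component attention (hypothetical, not the paper's model).

    block_luma: (N, N) collocated luma samples of the block to predict
    ref_luma:   (R,)   luma samples at the boundary reference positions
    ref_chroma: (R, 2) Cb/Cr samples at the same reference positions
    w_q, w_k:   (1, d) projection matrices (random here, learned in practice)
    """
    n = block_luma.shape[0]
    q = block_luma.reshape(-1, 1) @ w_q        # (N*N, d) one query per predicted sample
    k = ref_luma.reshape(-1, 1) @ w_k          # (R, d)   one key per reference sample
    scores = q @ k.T / np.sqrt(q.shape[1])     # (N*N, R) scaled dot-product scores
    weights = softmax(scores, axis=1)          # each row sums to 1 over the references
    pred = weights @ ref_chroma                # (N*N, 2) weighted mix of reference chroma
    return pred.reshape(n, n, 2), weights


# Usage with random data: a 4x4 block and 2*N + 1 boundary references.
rng = np.random.default_rng(0)
n, d = 4, 8
r = 2 * n + 1
pred, weights = attention_chroma_prediction(
    rng.random((n, n)),
    rng.random(r),
    rng.random((r, 2)),
    rng.standard_normal((1, d)),
    rng.standard_normal((1, d)),
)
print(pred.shape, weights.shape)
```

Because each row of `weights` is a probability distribution over the reference samples, every predicted sample can draw on a different mix of boundary positions, which is what lets the location of individual references influence the prediction. A fully connected layer applied to the flattened boundary has no such explicit per-sample weighting.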
