Improved Inter-Layer Prediction for the Scalable Extensions of HEVC

Following the completion of the single-layer H.265/HEVC standard, its scalable extensions, called Scalable High Efficiency Video Coding (SHVC), are currently under development. Compared to the simulcast solution, which simply compresses each layer separately, SHVC offers higher coding efficiency by means of inter-layer prediction. Inter-layer prediction is implemented by inserting inter-layer reference (ILR) pictures, generated from reconstructed base layer (BL) pictures, into the enhancement layer (EL) decoded picture buffer (DPB) for motion-compensated prediction of the collocated pictures in the EL. If the EL has a higher resolution than the BL, the reconstructed BL pictures need to be up-sampled to form the ILR pictures. Because the ILR picture is generated from the reconstructed BL picture, its suitability for efficient inter-layer prediction may be limited for the following reasons. First, quantization is usually applied when coding the BL pictures, which causes the reconstructed BL texture to contain undesired coding artifacts such as blocking, ringing, and color artifacts. Second, in the case of spatial scalability, a down-sampling process is used to create the BL pictures. To reduce aliasing, this process typically removes the high frequency information in the video signal, so the texture in the ILR picture lacks certain high frequency information. In contrast, the EL temporal reference pictures contain plentiful high frequency information, which could be extracted to enhance the quality of the ILR picture. To further improve the efficiency of inter-layer prediction, a low pass filter may be applied to the ILR picture to alleviate the quantization noise introduced by the BL coding process.
In this paper, an ILR enhancement method is proposed that improves the quality of the ILR picture by combining the high frequency information extracted from the EL temporal reference pictures with the low frequency information extracted from the ILR picture. Experimental results show that the proposed method can significantly increase the ILR efficiency for EL coding. Under the Common Test Condition of SHVC, which defines a number of temporal prediction structures called Random Access (RA), Low-delay B (LD-B), and Low-delay P (LD-P), the proposed method provides average {Y, U, V} BD-rate (BL+EL) gains of {2.0%, 7.1%, 8.2%}, {2.2%, 6.7%, 7.6%}, and {4.0%, 7.4%, 8.4%} for RA, LD-B, and LD-P, respectively, in comparison to the SHVC reference software SHM-2.0.
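The combination of frequency bands described above can be illustrated with a minimal sketch. It is not the filter design used in the paper or in SHM: the 3-tap kernel, the function names `low_pass` and `enhance_ilr`, and the assumption that the EL temporal reference has already been aligned (for example, motion-compensated) to the ILR picture are all hypothetical choices made for illustration.

```python
import numpy as np


def low_pass(picture, taps=(1, 2, 1)):
    """Separable low-pass filter with edge replication.

    The 3-tap kernel is an assumed placeholder, not the filter from the paper.
    """
    kernel = np.asarray(taps, dtype=np.float64)
    kernel /= kernel.sum()
    pad = len(kernel) // 2
    padded = np.pad(picture.astype(np.float64), pad, mode="edge")
    # Filter along rows first, then along columns.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def enhance_ilr(ilr, el_ref, bit_depth=8):
    """Combine the low band of the ILR picture with the high band of an EL reference.

    Assumes `el_ref` has the same size as `ilr` and is already aligned to it.
    """
    low = low_pass(ilr)                                   # smooths BL quantization noise
    high = el_ref.astype(np.float64) - low_pass(el_ref)   # detail missing from the ILR
    max_val = (1 << bit_depth) - 1
    out_type = np.uint8 if bit_depth <= 8 else np.uint16
    return np.clip(np.rint(low + high), 0, max_val).astype(out_type)
```

Taking the high band as the residual of the same low-pass filter keeps the two bands complementary, so a reference picture with no detail leaves the filtered ILR picture unchanged.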