A Novel Wavefront-Based High Parallel Solution for HEVC Encoding

By introducing many enhanced coding tools, High Efficiency Video Coding (HEVC) achieves a significant improvement in coding efficiency at the cost of increased computational complexity. To reduce the encoding time of HEVC, this paper proposes a wavefront-based high parallel (WHP) solution that integrates novel data-level and task-level methods. On the data level, optimal single-instruction-multiple-data (SIMD) algorithms are designed for the enhanced coding tools: the multiplications in motion compensation are replaced by add and shift operations that take fewer instruction cycles, the transpose in the transform is removed by realigning the coefficients, and memory access in the sum of absolute differences/sum of squared differences calculation is minimized by fully reusing the registers. On the task level, a novel inter-frame wavefront (IFW) method is developed by effectively reducing the dependencies of wavefront parallel processing (WPP). In addition, a coding tree block (CTB) level parallelism analysis method is presented to demonstrate the superiority of the IFW method over other representative HEVC parallel methods. Furthermore, a three-level thread management scheme is proposed to best exploit the parallelism of the IFW method and achieve the corresponding encoding speedup. Extensive experimental results show that the overall WHP solution brings up to $57.65\times$, $65.55\times$, and $88.17\times$ speedup for HEVC encoding of Wide Video Graphics Array (WVGA), 720p, and 1080p standard test sequences, respectively, while maintaining the same coding performance as WPP. The proposed solution has also been deployed by several leading video companies in China, providing HEVC video service to more than 1.3 million users every day.
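To make the WPP dependence mentioned above concrete, the following minimal sketch models plain intra-frame WPP only; it is not the paper's IFW method or its parallelism analysis. It assumes every CTB takes one unit of time and that a CTB may start once its left neighbour and the upper-right neighbour in the row above have finished. The function names `wpp_schedule` and `parallelism_profile` are hypothetical and chosen for illustration.

```python
from collections import Counter


def wpp_schedule(rows, cols):
    """Earliest unit-time step at which each CTB can start under intra-frame WPP.

    Assumption: every CTB costs one time unit and depends only on its left
    neighbour and on the upper-right neighbour in the previous CTB row.
    """
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            deps = []
            if c > 0:
                deps.append(start[r][c - 1])  # left neighbour
            if r > 0:
                # upper-right neighbour, clamped at the last column
                deps.append(start[r - 1][min(c + 1, cols - 1)])
            start[r][c] = max(deps) + 1 if deps else 0
    return start


def parallelism_profile(start):
    """Number of CTBs that run concurrently at each time step."""
    counts = Counter(t for row in start for t in row)
    return [counts[t] for t in range(max(counts) + 1)]


if __name__ == "__main__":
    # A 1080p frame with 64x64 CTBs is 30 CTBs wide and 17 CTBs high.
    profile = parallelism_profile(wpp_schedule(17, 30))
    print("time steps:", len(profile), "peak parallelism:", max(profile))
```

Under these assumptions the wavefront ramps up and drains within every frame, so the number of busy threads stays well below the CTB row count for much of the frame. This is the kind of limit that relaxing the WPP dependence across frames is meant to address.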
