SIMD acceleration for HEVC encoding on DSP

As the new-generation video coding standard, High Efficiency Video Coding (HEVC) significantly improves video compression efficiency, but at the cost of a computational load that far exceeds what real-time video applications can tolerate and what general-purpose processors can deliver. In this paper, we focus on a fast SIMD-based implementation of the HEVC encoder on modern TI Digital Signal Processors (DSPs). We first profile the DSP-based HEVC encoder and identify the most time-consuming encoding modules. SIMD instructions are then exploited to increase the parallel computing capacity of these modules and thus speed up the encoder. Experimental results show that the proposed implementations significantly improve the encoding speed of the DSP-based HEVC encoder, with a speedup ratio of 8.38–87.32 over the original C-based encoder and 1.59–6.56 over the encoder compiled with o3 optimization enabled.
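To illustrate the kind of rewrite that SIMD acceleration involves, the following minimal sketch contrasts a scalar sum of absolute differences (SAD) with a version that processes four 8-bit pixels per 32-bit word. The abstract does not name the accelerated modules, so the choice of SAD is an assumption, as are all function names (`sad_scalar`, `sad_packed`, `absdiff_u8x4`, `hsum_u8x4`). The two helpers are portable stand-ins that emulate packed operations in plain C; on a TI C6000 device they would plausibly map to compiler intrinsics such as `_subabs4` and `_dotpu4`, but that mapping is not taken from the paper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar reference: one pixel per loop iteration. */
static uint32_t sad_scalar(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int d = (int)a[i] - (int)b[i];
        sum += (uint32_t)(d < 0 ? -d : d);
    }
    return sum;
}

/* Hypothetical helper: per-byte absolute difference of two packed words.
 * Emulates a 4-lane packed instruction in portable C. */
static uint32_t absdiff_u8x4(uint32_t x, uint32_t y)
{
    uint32_t r = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t xa = (x >> (8 * lane)) & 0xFFu;
        uint32_t yb = (y >> (8 * lane)) & 0xFFu;
        uint32_t d = xa > yb ? xa - yb : yb - xa;
        r |= d << (8 * lane);
    }
    return r;
}

/* Hypothetical helper: horizontal sum of the four bytes in a word. */
static uint32_t hsum_u8x4(uint32_t x)
{
    return (x & 0xFFu) + ((x >> 8) & 0xFFu) + ((x >> 16) & 0xFFu) + (x >> 24);
}

/* Packed version: four pixels per iteration, scalar loop for the tail. */
static uint32_t sad_packed(const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(a != NULL && b != NULL);
    uint32_t sum = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        uint32_t wa, wb;
        memcpy(&wa, a + i, 4); /* alignment-safe 32-bit load */
        memcpy(&wb, b + i, 4);
        sum += hsum_u8x4(absdiff_u8x4(wa, wb));
    }
    for (; i < n; i++) {
        int d = (int)a[i] - (int)b[i];
        sum += (uint32_t)(d < 0 ? -d : d);
    }
    return sum;
}
```

In this portable form the helpers are themselves loops, so the sketch shows only the data layout and loop structure, not a real speedup. Any gain would come from replacing each helper with a single hardware instruction.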
