Highly Paralleled Low-Cost Embedded HEVC Video Encoder on TI KeyStone Multicore DSP

Although HEVC, the emerging video coding standard, doubles the coding performance of its predecessor H.264/AVC, its significantly higher computational complexity is a major obstacle to deploying HEVC encoders in real-time applications on embedded processors such as digital signal processors (DSPs). In this paper, a highly parallel, low-cost, fast HEVC encoding solution is designed and implemented on the TI KeyStone multicore TMS320C6678 DSP. First, the overall structure of the HEVC encoder is redesigned around CTU-level parallelism, with full consideration of the hardware characteristics. Second, a low-delay, low-memory multicore data transmission mechanism is proposed to reduce the latency of data access between internal L2 memory and external DDR3. Third, the encoding bottlenecks, i.e., the most time-consuming encoding modules, are identified and accelerated with TI's C6000 SIMD instructions. Experimental results show that the proposed HEVC encoder on TI TMS320C6678 DSPs significantly improves real-time capability with tolerable performance loss: a 0.93 dB loss at an average speedup of 465.50 times compared with the CPU-based HM reference software. This makes it desirable for power-constrained real-time video applications.
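As a rough illustration of the third point, the sketch below shows how a packed, four-pixels-at-a-time formulation of a sum of absolute differences (SAD) relates to its scalar form. This is a portable C sketch under stated assumptions, not the encoder's actual code: the abstract does not say which modules were the bottlenecks, so SAD is an assumed example, and the names `sad_scalar`, `sad_packed`, `absdiff4`, and `hsum4` are hypothetical. On a C6000 device, `absdiff4` and `hsum4` would presumably each map to a single packed-data instruction; here they are emulated in plain C so the example runs anywhere.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar reference: SAD over n pixels, one pixel per iteration. */
uint32_t sad_scalar(const uint8_t *a, const uint8_t *b, int n)
{
    uint32_t sum = 0;
    for (int i = 0; i < n; i++) {
        int d = (int)a[i] - (int)b[i];
        sum += (uint32_t)(d < 0 ? -d : d);
    }
    return sum;
}

/* Hypothetical stand-in for a packed 4 x 8-bit absolute-difference
 * instruction: each byte lane of the result holds |x_lane - y_lane|. */
uint32_t absdiff4(uint32_t x, uint32_t y)
{
    uint32_t r = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t xa = (x >> (8 * lane)) & 0xFFu;
        uint32_t ya = (y >> (8 * lane)) & 0xFFu;
        uint32_t d = xa > ya ? xa - ya : ya - xa;
        r |= d << (8 * lane);
    }
    return r;
}

/* Hypothetical stand-in for a horizontal add of the four byte lanes. */
uint32_t hsum4(uint32_t x)
{
    return (x & 0xFFu) + ((x >> 8) & 0xFFu) + ((x >> 16) & 0xFFu) + (x >> 24);
}

/* Packed form: four pixels per iteration; n must be a multiple of 4. */
uint32_t sad_packed(const uint8_t *a, const uint8_t *b, int n)
{
    assert(n % 4 == 0);
    uint32_t sum = 0;
    for (int i = 0; i < n; i += 4) {
        uint32_t wa, wb;
        memcpy(&wa, a + i, 4); /* alignment-safe 32-bit load */
        memcpy(&wb, b + i, 4);
        sum += hsum4(absdiff4(wa, wb));
    }
    return sum;
}
```

Both functions return the same value for any input whose length is a multiple of four; the packed form simply does a quarter of the loop iterations, which is where a real SIMD instruction would save cycles.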
