Embedded Real-Time H264/AVC High Definition Video Encoder on TI’s KeyStone Multicore DSP

Advanced video encoders are computationally demanding, and emerging applications require them to run in real time. Multicore technology is a promising way to meet this constraint. In this context, this paper presents a parallel implementation of the H264/AVC high definition (HD) video encoder that exploits the processing power of the eight-core TMS320C6678 digital signal processor (DSP). A GOP-level parallelism approach is used to improve the encoding speed and meet the real-time constraint. A master core is reserved to handle data transfer between the DSP and the camera interface via a Gigabit Ethernet link. A multithreading algorithm and a ping-pong buffer technique enhance the classic GOP-level parallelism approach and hide communication overhead. Experimental results on seven slave DSP cores, each running at 1 GHz, show that our implementation achieves real-time HD (1280 × 720) video encoding at up to 28 f/s. The proposed parallel implementation accelerates the encoding process by a factor of 6.7 compared to the single-core implementation, without degrading quality in terms of PSNR or increasing the bit rate. Experiments also show that our proposed scheduling technique for hiding communication overhead saves up to 36% of the time of the full encoding chain, which includes frame capture, frame encoding, and saving the bitstream to a file.
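As an illustration only, GOP-level parallelism with ping-pong buffers can be sketched as a scheduling table. The abstract does not specify how GOPs are assigned to cores or how the buffers alternate, so the round-robin assignment, the core numbering (core 0 as master, cores 1 to 7 as slaves), and the function names below are assumptions rather than the authors' method. The sketch also computes the parallel efficiency implied by the reported speedup of 6.7 on seven slave cores.

```python
NUM_SLAVES = 7  # seven slave cores, per the abstract; core 0 is assumed to be the master


def schedule_gops(num_gops, num_slaves=NUM_SLAVES):
    """Return (gop, core, buffer) tuples for a round-robin GOP assignment.

    Hypothetical scheme: GOP g goes to slave core 1 + (g mod num_slaves),
    and each slave alternates between buffer 0 (ping) and buffer 1 (pong)
    on successive rounds, so one buffer can be filled while the other is
    being encoded.
    """
    plan = []
    for gop in range(num_gops):
        core = 1 + gop % num_slaves        # slave cores numbered 1..num_slaves
        buffer = (gop // num_slaves) % 2   # alternate ping/pong once per round
        plan.append((gop, core, buffer))
    return plan


def parallel_efficiency(speedup, num_slaves=NUM_SLAVES):
    """Speedup divided by the number of cores doing the encoding."""
    return speedup / num_slaves


if __name__ == "__main__":
    for gop, core, buf in schedule_gops(10):
        name = "ping" if buf == 0 else "pong"
        print(f"GOP {gop} -> slave core {core}, {name} buffer")
    print(f"parallel efficiency: {parallel_efficiency(6.7):.3f}")
```

Under these assumptions, a speedup of 6.7 on seven encoding cores corresponds to a parallel efficiency of about 0.96. The reported figures also imply a single-core rate of roughly 28 / 6.7 ≈ 4.2 f/s, assuming both numbers refer to the same encoding configuration.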
