Implementation of multi-standard video decoder on a heterogeneous coarse-grained reconfigurable processor

This paper proposes a task-based hybrid parallel and hybrid pipeline (THPHP) scheme for implementing multi-standard video decoding, covering MPEG-2, H.264, and the audio video coding standard (AVS), on a heterogeneous coarse-grained reconfigurable processor called the reconfigurable multimedia system (REMUS). The scheme improves decoding performance and satisfies the real-time requirements of high-definition (HD) decoding across these standards. THPHP combines two parts: a task-based hybrid parallel scheme, in which macro-block (MB)-level, block-level, and sub-block-level decoding tasks are parallelized to improve data processing throughput, and a hybrid pipeline scheme, in which slice-level, MB-level, block-level, and sub-block-level computations are pipelined to improve efficiency. Computation-intensive tasks, such as motion compensation, intra prediction, inverse discrete cosine transform, reconstruction, and deblocking filtering, are implemented on two reconfigurable processing units, which are the core computing engines of REMUS. With the proposed scheme, the implementations decode H.264 high profile (HP) streams at 1920×1080@30 fps, AVS Jizhun profile (JP) streams at 1920×1080@39 fps, and MPEG-2 main profile (MP) streams at 1920×1080@41 fps when running at 200 MHz. Compared with XPP-III, a commercial reconfigurable processor, REMUS improves performance by 1.81× and energy efficiency by 14.3× for H.264 HD decoding.
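To put the reported frame rates in perspective, the sketch below estimates the average clock-cycle budget available per macroblock at 200 MHz. It is a back-of-the-envelope calculation and not a figure from the paper: it assumes 16×16 macroblocks, a frame height padded up to a whole number of macroblocks (1080 to 1088 lines), and that every clock cycle is available for decoding. The function names are hypothetical.

```python
MB_SIZE = 16  # macroblock edge in luma samples (assumed 16x16 for all three standards)


def macroblocks_per_frame(width: int, height: int) -> int:
    """Number of macroblocks per frame, rounding each dimension up to a full MB."""
    mbs_wide = (width + MB_SIZE - 1) // MB_SIZE
    mbs_high = (height + MB_SIZE - 1) // MB_SIZE
    return mbs_wide * mbs_high


def cycle_budget_per_mb(width: int, height: int, fps: float, clock_hz: float) -> float:
    """Average clock cycles available per macroblock at the given frame rate."""
    return clock_hz / (macroblocks_per_frame(width, height) * fps)


if __name__ == "__main__":
    # Frame rates and clock frequency as reported in the abstract.
    for name, fps in [("H.264 HP", 30), ("AVS JP", 39), ("MPEG-2 MP", 41)]:
        budget = cycle_budget_per_mb(1920, 1080, fps, 200e6)
        print(f"{name}: about {budget:.0f} cycles per macroblock")
```

Under these assumptions a 1920×1080 frame holds 8160 macroblocks, so sustaining 30 fps leaves on the order of 800 cycles per macroblock. A budget of that size suggests why MB-, block-, and sub-block-level tasks would need to be both parallelized and pipelined rather than executed one after another.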
