An efficient pipeline execution of H.264/AVC intra 4×4 frame design

In this paper, we present an optimized implementation of the H.264 intra 4×4 algorithm that reduces the time of the intra 4×4 process. The main source of wasted time in the conventional intra 4×4 architecture is the serialization of intra prediction and reconstruction across the sixteen 4×4 blocks of a macroblock: intra prediction of the current 4×4 block cannot begin before the previous 4×4 block has been reconstructed. For a high-speed implementation, we therefore replaced the conventional architecture with a pipelined one while maintaining consistency with the standard. We studied ten alternative scanning orders that rearrange the intra 4×4 block order, and chose the one that best reduces dependencies between consecutively executed blocks without performance degradation. This order is implemented as a pipelined architecture in VHDL. The VHDL code is verified to work at 100 MHz in an ALTERA Stratix II EP2S60F1020C3 FPGA. As a result, the processing time is reduced by 31.25% compared to the conventional implementation, which makes the design a good candidate for real-time video applications. The H.264 intra 4×4 hardware and software are demonstrated to work together on an ALTERA NIOS-II development board with a Stratix II EP2S60F1020C3 FPGA.
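The effect of block reordering on a two-stage pipeline can be illustrated with a small timing model. The Python sketch below is not the VHDL design described here. It assumes that prediction and reconstruction each take one time unit, that a block depends only on its left, top, top-left and top-right neighbours inside the same macroblock, and that a top-right neighbour counts only if it precedes the block in the standard order. The names ZSCAN, EXAMPLE_ORDER, dependencies, serial_time and pipelined_time are hypothetical, and EXAMPLE_ORDER is one illustrative reordering, not necessarily the order selected in this work.

```python
# Standard (z-scan) index of each 4x4 block at (row, col) in a macroblock.
ZSCAN = [
    [0, 1, 4, 5],
    [2, 3, 6, 7],
    [8, 9, 12, 13],
    [10, 11, 14, 15],
]

# Hypothetical reordering used only for illustration (an assumption).
EXAMPLE_ORDER = [0, 1, 2, 4, 3, 5, 8, 6, 9, 7, 10, 12, 11, 13, 14, 15]


def dependencies():
    """Map each block to the in-macroblock neighbours its prediction reads."""
    deps = {}
    for r in range(4):
        for c in range(4):
            cur = ZSCAN[r][c]
            found = set()
            # Offsets: left, top, top-left, top-right.
            for dr, dc in ((0, -1), (-1, 0), (-1, -1), (-1, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < 4 and 0 <= nc < 4:
                    nb = ZSCAN[nr][nc]
                    # Keep a neighbour only if it precedes the block in the
                    # standard order; this drops some top-right neighbours.
                    if nb < cur:
                        found.add(nb)
            deps[cur] = found
    return deps


def serial_time(order):
    """Conventional model: predict, then reconstruct, one block at a time."""
    return 2 * len(order)


def pipelined_time(order, deps):
    """Two-stage pipeline: prediction of a block may overlap reconstruction
    of the previous block unless it depends on a block still in flight."""
    pred_free = 0    # time at which the prediction stage becomes idle
    recon_free = 0   # time at which the reconstruction stage becomes idle
    recon_done = {}  # block -> time its reconstruction finishes
    for blk in order:
        missing = [d for d in deps[blk] if d not in recon_done]
        if missing:
            raise ValueError(f"block {blk} scheduled before {missing}")
        start = max([pred_free] + [recon_done[d] for d in deps[blk]])
        pred_end = start + 1
        recon_end = max(pred_end, recon_free) + 1
        pred_free, recon_free = pred_end, recon_end
        recon_done[blk] = recon_end
    return recon_free


deps = dependencies()
t_serial = serial_time(EXAMPLE_ORDER)              # 32 time units
t_pipe = pipelined_time(EXAMPLE_ORDER, deps)       # 22 time units
print(t_serial, t_pipe, 100 * (t_serial - t_pipe) / t_serial)
```

Under these simplifying assumptions the model gives 32 time units for serial execution and 22 for the reordered pipeline, a saving of 31.25%. This matches the reduction reported above, but the model is only a toy illustration and should not be read as the derivation of that figure.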
