Mapping and optimizing a software-only real-time MPEG-2 video encoder on VLIW architectures

Because of their high computational demand, MPEG-2 video coding solutions have been based mainly on custom hardware (ASIC) systems. Such systems lack the flexibility and adaptability of software-based solutions. Achieving real-time MPEG-2 video encoding in software remains a major challenge. A typical MPEG-2 encoder performs 20 to 30 GOPS (giga operations per second), which exceeds the capabilities of the most advanced contemporary processors. In this thesis, we have developed and tested a highly optimized, low-complexity, high-quality MPEG-2 video encoder in software for Texas Instruments' fixed-point TMS320C6201 VLIW (Very Long Instruction Word) processor. First, we developed MPEG-2 video encoder software written in C for the C62x processor platform. However, because of differences in processor architecture, the software had to be modified and optimized to ensure that the encoder runs efficiently on the VLIW architecture. The optimizations are done at the assembly-language level to maximize the attainable instruction-level parallelism (ILP) of the C62x VLIW architecture. In our experience, the optimizations performed by the C62x optimizing C compiler alone could not meet the real-time requirements of MPEG-2. After code remapping and optimization, the resulting MPEG-2 video encoder implementation runs approximately 32 times faster than the original unoptimized encoder. Moreover, the current version of the encoder can handle the SIF (320x240) video format at 16 frames per second with both I and P pictures, and CCIR-601 (720x480) at 15 frames per second with I pictures only. Our real-time MPEG-2 encoder has been implemented and tested on the C62x Evaluation Module (EVM) board from TI.
