Research on Transcoding of MPEG-2/H.264 Video Compression

Video transcoding performs one or more operations, such as bit-rate and format conversion, to transform one compressed video stream into another. It is an essential component of current and future multimedia systems that aim to provide universal access. Transcoding enables multimedia devices with diverse capabilities and formats to exchange video content over heterogeneous network platforms. To match the available network bandwidth, a video transcoder can dynamically adjust the bit-rate and frame-rate of the video bit-stream without imposing additional functional requirements on the decoder. In addition, a transcoder provides video format conversion to enable content exchange. Currently, one of the biggest business drivers for transcoding is the growing number of service providers offering HD (High Definition) content, and the need for cost-effective yet high-quality transmission of HD content will grow significantly. One use case of HD transcoding is video storage. HD applications require far more storage space than other video formats. Traditionally, HD content is compressed with the MPEG-2 standard. If video content in MPEG-2 format is transcoded to the H.264/AVC standard, about 50% of the storage space can be saved. This dissertation focuses on transcoding from MPEG-2 to H.264/AVC for HDTV applications.

The MPEG-2 video coding standard (also known as ITU-T H.262), which was developed about ten years ago primarily as an extension of the prior MPEG-1 video capability with support for interlaced video coding, was an enabling technology for digital television systems worldwide. It is widely used for the transmission of standard definition (SD) and HDTV signals over satellite, cable, and terrestrial emission, and for the storage of high-quality SD video signals on DVDs.
ITU-T Recommendation H.264 and ISO/IEC MPEG-4 Part 10 Advanced Video Coding (H.264/AVC for short) is the state-of-the-art video compression standard developed by the ITU-T/ISO/IEC Joint Video Team (JVT), which consists of experts from ITU-T's Video Coding Experts Group (VCEG) and ISO/IEC's Moving Picture Experts Group (MPEG). H.264/AVC represents a delicate balance between coding gain, implementation complexity, and cost, based on the state of VLSI (ASIC and microprocessor) design technology. The H.264/AVC design typically improves coding efficiency by a factor of two over MPEG-2, the most widely used video coding standard today, while keeping cost within an acceptable range.

Transcoding from MPEG-2 to H.264/AVC faces many challenges because of the large syntax gap between the two standards. In hardware design, one problem is how to apply data reuse methods to reduce bandwidth. Traditional search window reuse schemes rely on the regular overlap between successive search windows, which is guaranteed by full search block matching algorithms (FSBMA). In MPEG-2 to H.264/AVC transcoding, this regularity is broken by the MPEG-2 motion vector (MV), which is reused as the search center at the H.264/AVC encoder end. In this dissertation, two search window reuse methods, Level C and Level C+, are proposed for MPEG-2 to H.264/AVC transcoding to achieve different bandwidth levels. A hardware architecture suitable for the proposed Level C scheme is also presented. In addition, a motion vector prediction algorithm for transcoding is proposed to improve the precision of the search window position.

This dissertation consists of six chapters, as follows.

Chapter 1 [Introduction] introduces transcoding functions and existing architectures for implementing transcoding systems. Transcoding functions are classified as homogeneous, heterogeneous, and additional functions.
Homogeneous transcoding converts between video bit-streams of the same standard; heterogeneous transcoding converts between different video coding standards. Additional functions include error resilience and logo/watermark insertion. Data reuse schemes are also introduced, since they play the most important role in bandwidth reduction in current video coding system design. Two kinds of data locality are distinguished: locality in the current frame and locality in the reference frame. Locality in the reference frame is further classified into five categories: Level A, Level B, Level C, Level C+, and Level D.

Chapter 2 [Level C Scheme for Transcoding] presents a Level C search window reuse scheme for MPEG-2 to H.264 transcoding, especially for HDTV applications. The scheme is based on the observation that neighboring MPEG-2 MVs often have similar values. If the difference between successive MVs is less than a threshold, the MV field is defined as smooth and the MVs are regularized to a fixed interval, so that the search windows of two successive MBs are 16 pixels apart in the x-coordinate. Two successive MBs can therefore share part of the search window if the MV field is smooth; otherwise the search window must be flushed. Since the experimental results show that most MBs in a sequence can be regularized, and since the high accuracy of the MPEG-2 MV allows a smaller search range, the transcoder achieves a low bandwidth level. Experimental results show that the proposed method achieves an average search window reuse rate of 93.1% on HDTV720p sequences with almost no degradation in video quality. The bandwidth of the proposed scheme is reduced to 40.6% of that of a transcoder without any data reuse scheme, which is almost equal to the bandwidth level of a regular H.264/AVC encoder with the Level C+ scheme.

Chapter 3 [Level C+ Scheme for Transcoding] proposes a search window reuse method (Level C+) for MPEG-2 to H.264/AVC transcoding.
The proposed method is designed for ultra-low bandwidth applications in which on-chip memory is not the main constraint. Ultra-low bandwidth (Rα<2) is required in some practical video transcoding system designs because: 1) bandwidth resources in these designs are strictly limited; 2) larger bandwidth also leads to higher power dissipation, higher package cost, and more problems with skew; 3) the size of on-chip memory is not a constraining factor in these designs, given the continually decreasing production cost of on-chip memory. Furthermore, from a system point of view, memory traffic introduced by other components, such as the variable-length codec and DCT/IDCT, must also be considered. Additionally, the motion estimation process uses only luminance pixel data, while the other components also use chrominance data, which increases the importance of a strong data reuse level. By loading the search window for the motion estimation unit (MEU) and applying motion vector clipping, each MB in the MEU can exploit both horizontal and vertical search window reuse. An ultra-low bandwidth level (Rα<2) can be achieved at an acceptable cost in on-chip memory.

Chapter 4 [Hardware Architecture for Level C Scheme] proposes a low-bandwidth IME (Integer Motion Estimation) module for MPEG-2 to H.264 transcoder design. The Partial SAD architecture is adopted because it has a smaller gate count and is suitable for medium- and small-resolution videos. Another advantage is that it has a shorter critical path delay than the SAD Tree architecture, because partial SADs are stored and propagated by propagation and delay registers. Based on the Level C bandwidth reduction method for transcoding, a modified ping-pong memory control scheme combined with the Partial SAD VBSME architecture is realized.
The memory control unit must achieve two primary goals: 1) avoid conflicts between memory input and output; 2) keep the IME module fully utilized, which means the ME operation must never stall. These objectives are usually achieved with a ping-pong strategy: while one SRAM is in use, the other is updated. The proposed architecture contains four memory banks (Mem 0-3) for storing reference pixels. Two memories are involved in performing ME on 47×16 reference pixels. Reference pixels for the ME operation of each MB are stored in three memory banks. In our design, Mem 0-2 are accessed circularly when the MV field is smooth; Mem 3 is used when the MV field is non-smooth. Experimental results show that the bandwidth of the proposed architecture is 70.6% of that of a regular H.264 IME (Level C+ scheme, 2 MBs stitched vertically), while its on-chip memory size is 11.7% of that design's.

Chapter 5 [Motion Vector Prediction for Transcoding] presents a hardware-oriented motion vector predictor (MVP) scheme for MPEG-2 to H.264/AVC transcoding. In transcoding, motion estimation is usually not performed in the transcoder because of its computational complexity. Instead, motion vectors extracted from the incoming bit-stream are reused. In many existing works, the motion vector is improved by a procedure called motion vector refinement. This method is based on the observation that the motion vector deviation in most macroblocks is within a small range, so the optimal motion vector lies near the incoming motion vector. However, this method is difficult to implement in hardware. In this dissertation, we show that the MVP derived from neighboring sub-blocks is a more accurate search center than the MPEG-2 MV when the MPEG-2 MV field is non-smooth. A criterion based on relative motion is proposed to evaluate the smoothness of the MPEG-2 MV field, and a hardware-oriented MV prediction scheme based on this smoothness is also proposed.
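The search-center choice described for Chapter 5 can be illustrated with a minimal sketch. It is not the dissertation's scheme: the abstract does not give the relative-motion criterion, so the smoothness test below (every neighboring MPEG-2 MV within a hypothetical threshold of the current one) is only a stand-in, and the predictor is a simplified component-wise median of three neighboring MVs in the style of H.264. All function and parameter names are assumptions.

```python
def median3(a, b, c):
    """Middle value of three numbers."""
    return sorted((a, b, c))[1]


def median_mvp(left, top, top_right):
    """Component-wise median of three neighboring MVs (simplified predictor)."""
    return (
        median3(left[0], top[0], top_right[0]),
        median3(left[1], top[1], top_right[1]),
    )


def choose_search_center(mpeg2_mv, neighbor_mpeg2_mvs, neighbor_h264_mvs, threshold=4):
    """Return the MV to use as the search center for one block.

    mpeg2_mv:           incoming MPEG-2 MV of the current macroblock.
    neighbor_mpeg2_mvs: MPEG-2 MVs of neighboring macroblocks.
    neighbor_h264_mvs:  (left, top, top_right) MVs already chosen on the H.264 side.
    threshold:          hypothetical smoothness bound, in pixels.
    """
    # Stand-in smoothness test: every neighbor is close to the current MV.
    smooth = all(
        abs(mpeg2_mv[0] - n[0]) < threshold and abs(mpeg2_mv[1] - n[1]) < threshold
        for n in neighbor_mpeg2_mvs
    )
    if smooth:
        return mpeg2_mv  # smooth field: trust the incoming MPEG-2 MV
    return median_mvp(*neighbor_h264_mvs)  # non-smooth field: fall back to the MVP


neighbors = [(5, 1), (3, 0), (4, 2)]
print(choose_search_center((4, 1), neighbors, [(9, 9), (8, 8), (7, 7)]))
print(choose_search_center((30, -12), neighbors, [(6, 1), (2, 0), (4, 3)]))
```

The first call keeps the MPEG-2 MV because it agrees with its neighbors; the second treats the outlier MV as a non-smooth case and uses the median predictor instead.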
Experimental results show that the proposed MV prediction scheme, with a relatively small search range, approaches the performance of the full search algorithm. Compared with the method that uses only the MPEG-2 MV, the proposed approach achieves a significant improvement in motion prediction accuracy, especially in sequences with fast motion and complicated backgrounds.

Chapter 6 [Conclusion] summarizes the results of this research and indicates future work.
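As an illustration of the Level C idea summarized for Chapter 2, the following minimal sketch classifies the MPEG-2 MVs along one macroblock row as smooth or non-smooth and plans the search centers accordingly. It is not the dissertation's implementation: the function names, the default threshold of 4, and the choice to compare each MV against the MV currently in use as the search-center offset are assumptions made for illustration.

```python
MB_SIZE = 16  # macroblock width in pixels; the regularized interval from the abstract


def plan_search_centers(mvs, threshold=4, mb_y=0):
    """Plan search centers for one row of macroblocks (MBs).

    mvs: MPEG-2 motion vectors (mvx, mvy), one per MB, left to right.
    Returns one (center_x, center_y, reused) tuple per MB.
    """
    plan = []
    used_mv = None  # MV currently applied as the search-center offset
    for i, (mvx, mvy) in enumerate(mvs):
        # Assumed smoothness test: both components within `threshold`
        # of the MV already in use.
        smooth = (
            used_mv is not None
            and abs(mvx - used_mv[0]) < threshold
            and abs(mvy - used_mv[1]) < threshold
        )
        if not smooth:
            # Non-smooth field (or first MB): flush and restart from this MV.
            used_mv = (mvx, mvy)
        # While used_mv stays fixed, consecutive centers are exactly MB_SIZE
        # apart in x, so the overlapping part of the search window can be kept.
        plan.append((i * MB_SIZE + used_mv[0], mb_y + used_mv[1], smooth))
    return plan


def reuse_rate(plan):
    """Fraction of MBs that reuse part of the previous search window."""
    return sum(1 for _, _, reused in plan if reused) / len(plan)


row = [(5, 2), (6, 2), (4, 3), (20, -8), (21, -8)]
plan = plan_search_centers(row)
print(plan)
print(reuse_rate(plan))
```

In this toy row the fourth MB breaks smoothness and forces a flush, while the second, third, and fifth MBs keep the regular 16-pixel spacing and can share part of the previous window.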
