Image and Video Coding/Transcoding: A Rate Distortion Approach

Due to the lossy nature of image/video compression and the expensive bandwidth and computation resources in a multimedia system, one of the key design issues for image and video coding/transcoding is to optimize the trade-off among distortion, rate, and/or complexity. This thesis studies the application of rate distortion (RD) optimization approaches to image and video coding/transcoding, with two aims: to explore the best RD performance of a video codec compatible with the newest video coding standard H.264, and to design computationally efficient down-sampling algorithms with high visual fidelity in the discrete cosine transform (DCT) domain. RD optimization for video coding in this thesis considers two objectives, i.e., to achieve the best encoding efficiency in terms of minimizing the actual RD cost and to maintain decoding compatibility with the newest video coding standard H.264. By the actual RD cost, we mean a cost based on the final reconstruction error and the entire coding rate. Specifically, an operational RD method is proposed based on a soft decision quantization (SDQ) mechanism, which has its roots in a fundamental RD theoretic study on fixed-slope lossy data compression. Using SDQ instead of hard decision quantization, we establish a general framework in which motion prediction, quantization, and entropy coding in a hybrid video coding scheme such as H.264 are jointly designed to minimize the actual RD cost on a frame basis. The proposed framework can be applied to optimize any hybrid video coding scheme, provided that specific algorithms are designed corresponding to the coding syntaxes of a given standard codec, so as to maintain compatibility with the standard.
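To make the difference between hard and soft decision quantization concrete, the following is a minimal sketch and not the graph-based algorithm of the thesis. It assumes a Lagrangian cost of the form J = D + lambda * R, quantizes a single coefficient in isolation, and uses a made-up rate model (`toy_rate_bits`) in place of the actual H.264 entropy coder; all function names and parameters here are hypothetical.

```python
import math


def toy_rate_bits(level):
    """Hypothetical rate model (an assumption, not H.264 entropy coding):
    a zero level costs 1 bit, larger magnitudes cost more."""
    if level == 0:
        return 1.0
    return 2.0 * math.log2(abs(level) + 1) + 2.0


def rd_cost(coeff, level, step, lam):
    """Lagrangian cost J = D + lam * R for one coefficient."""
    distortion = (coeff - level * step) ** 2
    return distortion + lam * toy_rate_bits(level)


def hard_decision(coeff, step):
    """Hard decision: pick the nearest reconstruction level, ignoring rate."""
    sign = 1 if coeff >= 0 else -1
    return sign * int(math.floor(abs(coeff) / step + 0.5))


def soft_decision(coeff, step, lam):
    """Soft decision: among a few candidate levels, pick the one that
    minimizes the RD cost rather than the distortion alone."""
    sign = 1 if coeff >= 0 else -1
    hdq = hard_decision(coeff, step)
    # Candidates: the hard-decision level, one level toward zero, and zero.
    candidates = sorted({hdq, hdq - sign, 0}, key=abs)
    return min(candidates, key=lambda lv: rd_cost(coeff, lv, step, lam))


if __name__ == "__main__":
    for coeff in (6, 14, -6):
        print(coeff, hard_decision(coeff, 10), soft_decision(coeff, 10, 20.0))
```

With `lam = 0` the soft decision coincides with the hard decision; as `lam` grows, coefficients near a decision boundary are pushed toward cheaper levels. In a real codec the rate of one coefficient depends on its neighbours through the entropy coder, which is why a joint search over the whole block, rather than this per-coefficient choice, is needed.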
Corresponding to the baseline profile syntaxes and the main profile syntaxes of H.264, respectively, we have proposed three RD algorithms: a graph-based algorithm for SDQ given motion prediction and quantization step sizes, an algorithm for residual coding optimization given motion prediction, and an iterative overall algorithm for jointly optimizing motion prediction, quantization, and entropy coding. Each algorithm is embedded in the next, in the order listed. Among the three algorithms, the SDQ design is the core, which is developed based on a given entropy coding
