SSIM-Motivated Two-Pass VBR Coding for HEVC

We propose a structural similarity (SSIM)-motivated two-pass variable bit rate (VBR) control algorithm for High Efficiency Video Coding (HEVC). Given a bit rate budget, the available bits are optimally allocated at the group of pictures (GoP), frame, and coding unit (CU) levels by hierarchically constructing a perceptually uniform space with an SSIM-inspired divisive normalization mechanism. The Lagrange multiplier $\lambda$, which controls the tradeoff between perceptual distortion and bit rate, is adopted as the GoP-level complexity measure. To derive $\lambda$, Laplacian distribution-based rate and perceptual distortion models are established after the first-pass encoding, and the target bits are dynamically allocated by maintaining a uniform Lagrange multiplier level across GoPs through $\lambda$ equalization. Within each GoP, rate control is further performed at the frame and CU levels based on SSIM-inspired divisive normalization, which aims to transform the prediction residuals into a perceptually uniform space. Experiments show that the proposed scheme achieves highly accurate rate control and superior rate-SSIM performance, which is further verified by subjective visual testing.
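
The $\lambda$ equalization step described above can be illustrated with a minimal sketch. It is not the paper's method: the Laplacian distribution-based rate model is not given in the abstract, so a hypothetical power-law model $R_i(\lambda) = a_i \lambda^{-b_i}$ stands in for it, and the names `gop_rate`, `equalize_lambda`, and the parameters `a`, `b` are assumptions made for illustration. The sketch only shows the general idea: after a first pass yields a rate model per GoP, search for one shared $\lambda$ at which the per-GoP rates sum to the budget.

```python
from typing import List, Tuple


def gop_rate(a: float, b: float, lam: float) -> float:
    """Hypothetical per-GoP rate model R(lam) = a * lam**(-b).

    This is a stand-in assumption, not the Laplacian-based model of the paper.
    With a > 0 and b > 0 the rate decreases monotonically as lam grows.
    """
    return a * lam ** (-b)


def equalize_lambda(
    models: List[Tuple[float, float]],
    budget: float,
    lo: float = 1e-6,
    hi: float = 1e6,
    iters: int = 200,
) -> Tuple[float, List[float]]:
    """Find one lam shared by all GoPs such that the total rate meets the budget.

    Uses bisection on a geometric scale; assumes the solution lies in [lo, hi].
    Returns the common lam and the resulting bit allocation per GoP.
    """
    for _ in range(iters):
        mid = (lo * hi) ** 0.5  # geometric midpoint
        total = sum(gop_rate(a, b, mid) for a, b in models)
        if total > budget:
            lo = mid  # spending too many bits: raise lam
        else:
            hi = mid  # under budget: lower lam
    lam = (lo * hi) ** 0.5
    return lam, [gop_rate(a, b, lam) for a, b in models]


if __name__ == "__main__":
    # Three GoPs with made-up model parameters (a, b) and a made-up budget.
    models = [(1000.0, 0.5), (4000.0, 0.5), (2500.0, 0.8)]
    lam, bits = equalize_lambda(models, budget=3000.0)
    print(f"common lambda = {lam:.4f}")
    print("bits per GoP  =", [round(x, 1) for x in bits])
```

Under this assumed model, GoPs that are harder to code (larger `a`) receive more bits at the same $\lambda$, which is the behavior that makes $\lambda$ usable as a GoP-level complexity measure.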
