Space-Time Video Regularity and Visual Fidelity: Compression, Resolution and Frame Rate Adaptation

To deliver today's voluminous video content over bandwidth-limited channels in a perceptually optimal way, it is important to consider the perceptual trade-offs of compression and space-time downsampling protocols. To this end, we have studied and developed new models of natural video statistics (NVS), which are useful because high-quality videos contain statistical regularities that are disturbed by distortions. Specifically, we model the statistics of divisively normalized differences between neighboring frames that are displaced relative to one another. In an extensive empirical study, we found that the paths of space-time displaced frame differences that exhibit maximal regularity under our NVS model generally align best with motion trajectories. Motivated by this, we build a new video quality prediction engine that extracts NVS features from displaced frame differences and combines them in a learned regressor that can accurately predict perceptual quality. As a stringent test of the new model, we apply it to the difficult problem of predicting the quality of videos subjected not only to compression, but also to downsampling in space and/or time. We show that the new quality model achieves state-of-the-art (SOTA) prediction performance on the new ETRI-LIVE Space-Time Subsampled Video Quality (STSVQ) database, which is dedicated to this problem. Downsampling protocols are of high interest to the streaming video industry, given rapid increases in frame resolutions and frame rates.
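
As a rough illustration of the quantity described above, the following sketch computes a displaced frame difference and applies a divisive normalization to it. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the Gaussian window (sigma = 7/6, radius 3), the stabilizing constant `c`, and the wrap-around border handling are all hypothetical choices borrowed from common natural-scene-statistics practice, since the abstract does not specify them.

```python
import numpy as np

def gaussian_kernel(sigma=7 / 6, radius=3):
    # 1-D Gaussian window, normalized to sum to 1 (assumed window, not from the paper).
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    return k / k.sum()

def local_mean(img, kernel):
    # Separable filtering: along rows, then along columns.
    # mode="same" keeps the image shape; borders are implicitly zero-padded.
    tmp = np.apply_along_axis(np.convolve, 1, img, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, tmp, kernel, mode="same")

def displaced_frame_difference(prev, curr, dy, dx):
    # Shift the previous frame by (dy, dx) pixels and subtract it from the
    # current frame. np.roll wraps around at the borders (a simplification).
    shifted = np.roll(prev, shift=(dy, dx), axis=(0, 1))
    return curr - shifted

def divisive_normalize(d, c=1.0):
    # Subtract the local mean and divide by the local standard deviation
    # plus a constant c that avoids division by zero (assumed form).
    k = gaussian_kernel()
    mu = local_mean(d, k)
    var = local_mean(d * d, k) - mu * mu
    sigma = np.sqrt(np.maximum(var, 0.0))
    return (d - mu) / (sigma + c)

# Synthetic example: the "current" frame is the previous frame moved by
# (1, 2) pixels, plus a small amount of noise.
rng = np.random.default_rng(0)
prev_frame = rng.normal(size=(64, 64))
curr_frame = np.roll(prev_frame, shift=(1, 2), axis=(0, 1))
curr_frame = curr_frame + 0.1 * rng.normal(size=(64, 64))

aligned = displaced_frame_difference(prev_frame, curr_frame, 1, 2)
unaligned = displaced_frame_difference(prev_frame, curr_frame, 0, 0)
coeffs = divisive_normalize(aligned)

# The difference taken along the true motion has much lower energy.
print(np.mean(aligned ** 2) < np.mean(unaligned ** 2))
```

In a full pipeline, statistical features would be extracted from coefficients such as `coeffs` for several candidate displacements and passed to a learned regressor. The specific features and the regressor are not given in the abstract, so they are left out of this sketch.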
