A Neural Network Model of Spatial Distortion Sensitivity for Video Quality Estimation

Accurate estimation of visual quality as perceived by humans is crucial for modern multimedia systems and, given how easily humans perform it, a surprisingly difficult task for computers. The low-complexity requirements of real-time applications make the problem even more challenging. This paper studies the application of a neural network-based spatial model of distortion sensitivity to the quality prediction of videos. We propose a simple yet effective adaptation of the loss function to cope with saturation effects in human quality ratings. This adaptation drastically reduces the number of training iterations needed for networks to replicate psychophysical human responses. Our experimental results show that compensating the spatio-temporal PSNR for spatial distortion sensitivity significantly improves its prediction performance while maintaining the advantage of low complexity.
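
As a rough illustration of the low-complexity baseline, the following sketch computes a spatio-temporal PSNR from the mean squared error pooled over all frames and pixels, together with one possible way of compensating it with sensitivity values. The abstract does not specify the compensation rule, so the weighting scheme, the function names, and the per-frame `sensitivity` input (standing in for the output of a learned sensitivity model) are assumptions made for illustration only, not the method of the paper.

```python
import numpy as np


def spatiotemporal_psnr(ref, dist, peak=255.0):
    """PSNR from the MSE pooled over all frames and pixels.

    ref, dist: arrays of shape (frames, height, width).
    """
    ref = np.asarray(ref, dtype=np.float64)
    dist = np.asarray(dist, dtype=np.float64)
    mse = np.mean((ref - dist) ** 2)
    if mse == 0:
        return float("inf")  # identical videos
    return 10.0 * np.log10(peak ** 2 / mse)


def sensitivity_weighted_psnr(ref, dist, sensitivity, peak=255.0):
    """Hypothetical compensation: weight each frame's MSE by a
    per-frame sensitivity value before pooling over time.

    sensitivity: positive weights of shape (frames,); assumed here to
    come from some external sensitivity model.
    """
    ref = np.asarray(ref, dtype=np.float64)
    dist = np.asarray(dist, dtype=np.float64)
    weights = np.asarray(sensitivity, dtype=np.float64)
    # Per-frame MSE: average over the two spatial axes only.
    frame_mse = np.mean((ref - dist) ** 2, axis=(1, 2))
    weighted_mse = np.sum(weights * frame_mse) / np.sum(weights)
    if weighted_mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / weighted_mse)
```

With uniform weights the weighted variant reduces to the plain spatio-temporal PSNR; larger weights make distortions in the corresponding frames count more heavily toward the final score.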
