Human motion quality assessment toward sophisticated sports scenes based on deeply-learned 3D CNN model

Abstract Video may be subject to various distortions during acquisition, processing, compression, storage, transmission, and reproduction, all of which reduce visual quality. In complex sports scenes in a big data environment, the depiction of human body movement is even more susceptible to such distortions. The quality of human motion directly affects the viewer's visual experience. It is therefore necessary to build an intelligent quality assessment model that evaluates human motion in complex motion scenarios in a big data environment. Such a model can be used to monitor and adjust video quality dynamically, and to guide the choice of algorithms and parameter settings in motion image processing systems. With the popularity of deep learning, convolutional neural networks have become a very important method in computer vision research. Extending the 2D-CNN approach, we propose a 3D convolutional neural network model for human motion quality assessment in complex motion scenarios. The model captures pose characteristics, motion trajectory, video brightness, and contrast across both time and space. Reference and distorted video pairs are fed into the network, and the output of each layer is treated as a feature map. The local similarity between the feature maps obtained from the reference video and those obtained from the distorted video is then calculated, and the local similarities are combined to obtain a global image quality score. Experiments show that the model achieves competitive performance for video quality assessment in a big data environment.
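The full-reference pipeline described in the abstract (3D feature maps from both videos, local similarity per layer, pooling into a global score) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not give the network architecture, the similarity measure, or the pooling rule. The sketch assumes random (untrained) 3D kernels, an SSIM-style similarity term with a small stabilizing constant `c`, and plain averaging within and across layers. All function names (`conv3d_relu`, `extract_features`, `local_similarity`, `quality_score`) are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv3d_relu(x, kernels):
    """Valid 3D cross-correlation followed by ReLU.

    x: array of shape (C, T, H, W); kernels: array of shape (O, C, kt, kh, kw).
    Returns an array of shape (O, T-kt+1, H-kh+1, W-kw+1).
    """
    # Sliding windows over the time, height, and width axes.
    windows = sliding_window_view(x, kernels.shape[2:], axis=(1, 2, 3))
    out = np.einsum("cthwijk,ocijk->othw", windows, kernels)
    return np.maximum(out, 0.0)


def extract_features(video, layers):
    """Run the video through each layer and keep every layer output as a feature map."""
    feats, x = [], video
    for kernels in layers:
        x = conv3d_relu(x, kernels)
        feats.append(x)
    return feats


def local_similarity(f_ref, f_dist, c=1e-6):
    """Assumed SSIM-style element-wise similarity; equals 1 where the maps agree."""
    return (2.0 * f_ref * f_dist + c) / (f_ref**2 + f_dist**2 + c)


def quality_score(ref, dist, layers):
    """Assumed pooling: mean similarity per layer, then mean over layers."""
    ref_feats = extract_features(ref, layers)
    dist_feats = extract_features(dist, layers)
    per_layer = [local_similarity(a, b).mean() for a, b in zip(ref_feats, dist_feats)]
    return float(np.mean(per_layer))


# Toy data: an 8-frame, 16x16 grayscale clip and a noisy copy of it.
rng = np.random.default_rng(0)
layers = [
    rng.standard_normal((4, 1, 3, 3, 3)),  # hypothetical layer 1: 1 -> 4 channels
    rng.standard_normal((8, 4, 3, 3, 3)),  # hypothetical layer 2: 4 -> 8 channels
]
reference = rng.random((1, 8, 16, 16))
distorted = reference + 0.1 * rng.standard_normal(reference.shape)

print("identical pair:", quality_score(reference, reference, layers))
print("distorted pair:", quality_score(reference, distorted, layers))
```

With this similarity term, an identical pair scores 1 and any distortion that changes the feature maps lowers the score. A trained network would replace the random kernels, and a learned or weighted pooling could replace the plain averages.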
