How to Assess the Quality of Compressed Surveillance Videos Using Face Recognition

Video surveillance plays an important role in public security. Compression reduces the storage needed for the growing volume of surveillance video, but it also degrades video quality. Video quality assessment (VQA) methods help balance data volume against the perceptual quality of compressed surveillance videos. Surveillance video quality assessment (SVQA) differs from conventional VQA because surveillance videos are usually used for specific tasks, e.g., pedestrian recognition, rather than for entertainment. In this paper, we therefore propose two full-reference SVQA methods based on the concept of quality of recognition. We first design two new tasks, distorted face verification (DFV) and distorted face identification (DFI), and build on them two SVQA methods, DFV-SVQA and DFI-SVQA, with corresponding quality metrics. The core components of the two methods are feature extractors (a DFV model and a DFI model), which we construct from convolutional-neural-network-based face recognition models. In addition, we construct a real-world surveillance video data set and use it to analyze how various factors, including the video codec, compression level, face resolution, and light intensity, affect the quality of compressed surveillance videos. We find that, compared with conventional VQA methods, our methods measure the quality of surveillance videos more effectively while maintaining acceptable time efficiency.
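The idea of a full-reference, recognition-based quality score can be pictured with a minimal sketch. The abstract does not give the actual DFV-SVQA or DFI-SVQA metrics, so everything here is an assumption for illustration: cosine similarity is used as the comparison, `verification_quality` is a hypothetical name, and `toy_extractor` is a trivial stand-in for the CNN-based feature extractor, not a face recognition model.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat a degenerate feature as "no match"
    return dot / (norm_a * norm_b)


def verification_quality(
    reference_face: Sequence[float],
    distorted_face: Sequence[float],
    extract_features: Callable[[Sequence[float]], Vector],
) -> float:
    """Hypothetical full-reference score: how close the features of the
    compressed (distorted) face stay to those of the uncompressed reference."""
    return cosine_similarity(
        extract_features(reference_face),
        extract_features(distorted_face),
    )


def toy_extractor(pixels: Sequence[float]) -> Vector:
    """Stand-in for a CNN feature extractor (assumption, illustration only):
    returns the mean intensity and the intensity range of the crop."""
    mean = sum(pixels) / len(pixels)
    spread = max(pixels) - min(pixels)
    return [mean, spread]


if __name__ == "__main__":
    reference = [10.0, 200.0, 30.0, 180.0]    # made-up reference face crop
    compressed = [100.0, 110.0, 100.0, 110.0]  # made-up crop with detail lost
    print(verification_quality(reference, reference, toy_extractor))
    print(verification_quality(reference, compressed, toy_extractor))
```

In this sketch the score is highest when the distorted face yields the same features as the reference and drops as compression removes the detail the extractor relies on. An identification-style variant would instead compare the distorted face against a gallery of reference identities.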
