论文信息 - An Automatic System for Unconstrained Video-Based Face Recognition

An Automatic System for Unconstrained Video-Based Face Recognition

Although deep learning approaches have achieved performance surpassing humans for still image-based face recognition, unconstrained video-based face recognition is still a challenging task due to large volume of data to be processed and intra/inter-video variations on pose, illumination, occlusion, scene, blur, video quality, etc. In this work, we consider challenging scenarios for unconstrained video-based face recognition from multiple-shot videos and surveillance videos with low-quality frames. To handle these problems, we propose a robust and efficient system for unconstrained video-based face recognition, which is composed of modules for face/fiducial detection, face association, and face recognition. First, we use multi-scale single-shot face detectors to efficiently localize faces in videos. The detected faces are then grouped through carefully designed face association methods, especially for multi-shot videos. Finally, the faces are recognized by the proposed face matcher based on an unsupervised subspace learning approach and a subspace-to-subspace similarity metric. Extensive experiments on challenging video datasets, such as Multiple Biometric Grand Challenge (MBGC), Face and Ocular Challenge Series (FOCS), IARPA Janus Surveillance Video Benchmark (IJB-S) for low-quality surveillance videos and IARPA JANUS Benchmark B (IJB-B) for multiple-shot videos, demonstrate that the proposed system can accurately detect and associate faces from unconstrained videos and effectively learn robust and discriminative features for recognition.

[1] Larry S. Davis,et al. Covariance discriminative learning: A natural and efficient approach to image set classification , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[2] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3] Ramakant Nevatia,et al. Robust multi-pose face tracking by multi-stage tracklet association , 2012, Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012).

[4] Ming Yang,et al. DeepFace: Closing the Gap to Human-Level Performance in Face Verification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[5] Sixue Gong,et al. Recurrent Embedding Aggregation Network for Video Face Recognition , 2019, ArXiv.

[6] Sixue Gong,et al. Video Face Recognition: Component-wise Feature Aggregation Network (C-FAN) , 2019, 2019 International Conference on Biometrics (ICB).

[7] Ross B. Girshick,et al. Mask R-CNN , 2017, 1703.06870.

[8] Xing Ji,et al. CosFace: Large Margin Cosine Loss for Deep Face Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[9] Yu Liu,et al. Quality Aware Network for Set to Set Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10] Cordelia Schmid,et al. Unsupervised metric learning for face identification in TV video , 2011, 2011 International Conference on Computer Vision.

[11] Fabio Tozeto Ramos,et al. Simple online and realtime tracking , 2016, 2016 IEEE International Conference on Image Processing (ICIP).

[12] Carlos D. Castillo,et al. Deep Learning for Understanding Faces: Machines May Be Just as Good, or Better, than Humans , 2018, IEEE Signal Processing Magazine.

[13] Carlos D. Castillo,et al. Video-Based Face Association and Identification , 2017, 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017).

[14] Rama Chellappa,et al. Visual tracking and recognition using appearance-adaptive models in particle filters , 2004, IEEE Transactions on Image Processing.

[15] Julianne H. Ayyad,et al. Recognizing people from dynamic and static faces and bodies: Dissecting identity with a fusion approach , 2010, Vision Research.

[16] Andrew Zisserman,et al. Deep Face Recognition , 2015, BMVC.

[17] Andrew Zisserman,et al. Automatic face recognition for film character retrieval in feature-length films , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[18] Anil K. Jain,et al. IARPA Janus Benchmark-B Face Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[19] Erik Learned-Miller,et al. FDDB: A benchmark for face detection in unconstrained settings , 2010 .

[20] Carlos D. Castillo,et al. Triplet probabilistic embedding for face verification and clustering , 2016, 2016 IEEE 8th International Conference on Biometrics Theory, Applications and Systems (BTAS).

[21] Yuxiao Hu,et al. MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition , 2016, ECCV.

[22] Xiaogang Wang,et al. Deep Learning Face Representation by Joint Identification-Verification , 2014, NIPS.

[23] Sergey Ioffe,et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning , 2016, AAAI.

[24] Larry S. Davis,et al. SSH: Single Stage Headless Face Detector , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[25] Ali Farhadi,et al. YOLO9000: Better, Faster, Stronger , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26] Rama Chellappa,et al. Dictionary-Based Face Recognition from Video , 2012, ECCV.

[27] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28] Ruiping Wang,et al. Manifold Discriminant Analysis , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[29] Rui Caseiro,et al. Ieee Transactions on Pattern Analysis and Machine Intelligence High-speed Tracking with Kernelized Correlation Filters , 2022 .

[30] Dacheng Tao,et al. Trunk-Branch Ensemble Convolutional Neural Networks for Video-Based Face Recognition , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31] Wen Gao,et al. Manifold–Manifold Distance and its Application to Face Recognition With Image Sets , 2012, IEEE Transactions on Image Processing.

[32] Sander Stuijk,et al. Online multi-face detection and tracking using detector confidence and structured SVMs , 2015, 2015 12th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[33] Rama Chellappa,et al. HyperFace: A Deep Multi-Task Learning Framework for Face Detection, Landmark Localization, Pose Estimation, and Gender Recognition , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[34] Anil K. Jain,et al. IARPA Janus Benchmark - C: Face Dataset and Protocol , 2018, 2018 International Conference on Biometrics (ICB).

[35] Rama Chellappa,et al. Unconstrained face verification using deep CNN features , 2015, 2016 IEEE Winter Conference on Applications of Computer Vision (WACV).

[36] Rama Chellappa,et al. Face Association for Videos Using Conditional Random Fields and Max-Margin Markov Networks , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37] Bruce A. Draper,et al. Overview of the Multiple Biometrics Grand Challenge , 2009, ICB.

[38] Xiaogang Wang,et al. Deeply learned face representations are sparse, selective, and robust , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39] Rama Chellappa,et al. Dictionary-Based Face and Person Recognition From Unconstrained Video , 2015, IEEE Access.

[40] Marwan Mattar,et al. Labeled Faces in the Wild: A Database forStudying Face Recognition in Unconstrained Environments , 2008 .

[41] Dongqing Zhang,et al. Neural Aggregation Network for Video Face Recognition , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42] Carlos D. Castillo,et al. Crystal Loss and Quality Pooling for Unconstrained Face Verification and Recognition , 2018, ArXiv.

[43] Carlos D. Castillo,et al. Hybrid Dictionary Learning and Matching for Video-based Face Verification , 2019, 2019 IEEE 10th International Conference on Biometrics Theory, Applications and Systems (BTAS).

[44] Mubarak Shah,et al. Face Recognition in Movie Trailers via Mean Sequence Sparse Representation-Based Classification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[45] Junjie Yan,et al. Face detection by structural models , 2014, Image Vis. Comput..

[46] Iasonas Kokkinos,et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[47] Wei Liu,et al. SSD: Single Shot MultiBox Detector , 2015, ECCV.

[48] Alan Edelman,et al. The Geometry of Algorithms with Orthogonality Constraints , 1998, SIAM J. Matrix Anal. Appl..

[49] Bhiksha Raj,et al. SphereFace: Deep Hypersphere Embedding for Face Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[50] Carlos D. Castillo,et al. A Fast and Accurate System for Face Detection, Identification, and Verification , 2018, IEEE Transactions on Biometrics, Behavior, and Identity Science.

[51] Anil K. Jain,et al. IJB–S: IARPA Janus Surveillance Video Benchmark , 2018, 2018 IEEE 9th International Conference on Biometrics Theory, Applications and Systems (BTAS).

[52] Shuo Yang,et al. WIDER FACE: A Face Detection Benchmark , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[53] Rama Chellappa,et al. A Real-Time Multi-Task Single Shot Face Detector , 2018, 2018 25th IEEE International Conference on Image Processing (ICIP).

[54] Stefanos Zafeiriou,et al. ArcFace: Additive Angular Margin Loss for Deep Face Recognition , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[55] James Philbin,et al. FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[56] Carlos D. Castillo,et al. UMDFaces: An annotated face dataset for training deep networks , 2016, 2017 IEEE International Joint Conference on Biometrics (IJCB).

[57] Carlos D. Castillo,et al. Unconstrained Still/Video-Based Face Verification with Deep Convolutional Neural Networks , 2016, International Journal of Computer Vision.

[58] Shengcai Liao,et al. Learning Face Representation from Scratch , 2014, ArXiv.

[59] Alice J. O'Toole,et al. A video database of moving faces and people , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[60] Carlos D. Castillo,et al. An All-In-One Convolutional Neural Network for Face Analysis , 2016, 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017).

[61] Andrew Zisserman,et al. Person Spotting: Video Shot Retrieval for Face Sets , 2005, CIVR.

[62] Chih-Jen Lin,et al. LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[63] Anil K. Jain,et al. Pushing the frontiers of unconstrained face detection and recognition: IARPA Janus Benchmark A , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[64] Tal Hassner,et al. Face recognition in unconstrained videos with matched background similarity , 2011, CVPR 2011.