SeqFace: Learning discriminative features by using face sequences

Funding information National Natural Science Foundation of China, Grant/Award Number: 61871413; Fundamental Research Funds for the Central Universities, Grant/Award Number: XK2020-03 Abstract Deep convolutional neural networks (CNNs) have greatly improved the Face Recognition (FR) performance in recent years. Almost all CNNs in FR are trained on the carefully labeled datasets containing plenty of identities. However, such high-quality datasets are very expensive to collect, which restricts many researchers to achieve state-of-the-art performance. In this paper, a framework, called SeqFace, for learning discriminative face features is proposed. Besides a traditional identity training dataset, the designed SeqFace can train CNNs by using an additional dataset which includes a large number of face sequences collected from videos. Moreover, the label smoothing regularization (LSR) and a new proposed discriminative sequence agent (DSA) loss are employed to enhance the discrimination power of deep face features via making full use of the sequence data. Only with a single ResNet model, the method achieves very competitive performance on several face recognition benchmarks, including LFW, YTF, CFP, AgeDB, and MegaFace. The code and model are publicly available at the website https://github.com/huangyangyu/SeqFace.

[1]  Shiguang Shan,et al.  Discriminative Covariance Oriented Representation Learning for Face Recognition with Image Sets , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[3]  Yann LeCun,et al.  The mnist database of handwritten digits , 2005 .

[4]  Dacheng Tao,et al.  Trunk-Branch Ensemble Convolutional Neural Networks for Video-Based Face Recognition , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Eric Granger,et al.  Video-based face recognition using ensemble of haar-like deep convolutional neural networks , 2017, 2017 International Joint Conference on Neural Networks (IJCNN).

[6]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Xiaogang Wang,et al.  Deeply learned face representations are sparse, selective, and robust , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Ira Kemelmacher-Shlizerman,et al.  The MegaFace Benchmark: 1 Million Faces for Recognition at Scale , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Yuxiao Hu,et al.  MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition , 2016, ECCV.

[11]  Xiaogang Wang,et al.  Deep Learning Face Representation from Predicting 10,000 Classes , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Yi Yang,et al.  Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in Vitro , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[13]  Stefanos Zafeiriou,et al.  ArcFace: Additive Angular Margin Loss for Deep Face Recognition , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Ming-Hsuan Yang,et al.  Unsupervised Domain Adaptation for Face Recognition in Unlabeled Videos , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[15]  Xiaogang Wang,et al.  DeepID3: Face Recognition with Very Deep Neural Networks , 2015, ArXiv.

[16]  Xiao Zhang,et al.  Range Loss for Deep Face Recognition with Long-Tailed Training Data , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[17]  Jian Cheng,et al.  NormFace: L2 Hypersphere Embedding for Face Verification , 2017, ACM Multimedia.

[18]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  M. Saquib Sarfraz,et al.  Self-Supervised Learning of Face Representations for Video Face Clustering , 2019, 2019 14th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2019).

[20]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[21]  Marios Savvides,et al.  Ring Loss: Convex Feature Normalization for Face Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[22]  Robert Sabourin,et al.  Dynamic ensembles of exemplar-SVMs for still-to-video face recognition , 2017, Pattern Recognit..

[23]  Azriel Rosenfeld,et al.  Face recognition: A literature survey , 2003, CSUR.

[24]  Bhiksha Raj,et al.  SphereFace: Deep Hypersphere Embedding for Face Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Carlos D. Castillo,et al.  Frontal to profile face verification in the wild , 2016, 2016 IEEE Winter Conference on Applications of Computer Vision (WACV).

[26]  Marwan Mattar,et al.  Labeled Faces in the Wild: A Database forStudying Face Recognition in Unconstrained Environments , 2008 .

[27]  Tieniu Tan,et al.  A Light CNN for Deep Face Representation With Noisy Labels , 2015, IEEE Transactions on Information Forensics and Security.

[28]  Ming Yang,et al.  DeepFace: Closing the Gap to Human-Level Performance in Face Verification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Shengcai Liao,et al.  Learning Face Representation from Scratch , 2014, ArXiv.

[30]  Yann LeCun,et al.  Learning a similarity metric discriminatively, with application to face verification , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[31]  Fei Su,et al.  Contrastive-center loss for deep neural networks , 2017, 2017 IEEE International Conference on Image Processing (ICIP).

[32]  Stefanos Zafeiriou,et al.  Marginal Loss for Deep Face Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[33]  Meng Yang,et al.  Large-Margin Softmax Loss for Convolutional Neural Networks , 2016, ICML.

[34]  Nir Ailon,et al.  Deep Metric Learning Using Triplet Network , 2014, SIMBAD.

[35]  Shiguang Shan,et al.  A Benchmark and Comparative Study of Video-Based Face Recognition on COX Face Database , 2015, IEEE Transactions on Image Processing.

[36]  Stefanos Zafeiriou,et al.  AgeDB: The First Manually Collected, In-the-Wild Age Database , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[37]  Junmo Kim,et al.  Deep Pyramidal Residual Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Sergey Ioffe,et al.  Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Shifeng Zhang,et al.  Mis-classified Vector Guided Softmax Loss for Face Recognition , 2019, AAAI.

[40]  Sanja Fidler,et al.  Video Face Clustering With Unknown Number of Clusters , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[41]  Ken-ichi Maeda,et al.  Face recognition using temporal image sequence , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[42]  Xing Ji,et al.  CosFace: Large Margin Cosine Loss for Deep Face Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[43]  Hakan Cevikalp,et al.  Face recognition based on image sets , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[44]  Matti Pietikäinen,et al.  Face Description with Local Binary Patterns: Application to Face Recognition , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[45]  Reza Olfati-Saber,et al.  Kalman-Consensus Filter : Optimality, stability, and performance , 2009, Proceedings of the 48h IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference.

[46]  Wei Wu,et al.  AM-LFS: AutoML for Loss Function Search , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[47]  Tal Hassner,et al.  Face recognition in unconstrained videos with matched background similarity , 2011, CVPR 2011.

[48]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[49]  Yu Qiao,et al.  A Discriminative Feature Learning Approach for Deep Face Recognition , 2016, ECCV.

[50]  Shifeng Zhang,et al.  Loss Function Search for Face Recognition , 2020, ICML.

[51]  Xiaogang Wang,et al.  Deep Learning Face Representation by Joint Identification-Verification , 2014, NIPS.

[52]  Yu Qiao,et al.  Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks , 2016, IEEE Signal Processing Letters.