Investigating Nuisances in DCNN-Based Face Recognition

Face recognition “in the wild” has been revolutionized by deep learning-based approaches. It has been extensively demonstrated that deep convolutional neural networks (DCNNs) are powerful enough to overcome most of the limitations that affected face recognition algorithms based on hand-crafted features, including sensitivity to variations in illumination, pose, expression, and occlusion. The discriminative power of DCNNs comes from the fact that low- and high-level representations are learned directly from the raw image data. As a consequence, we expect the performance of a DCNN to be influenced by the characteristics of the image/video data fed to the network and by their preprocessing. In this paper, we present a thorough analysis of several aspects that affect the use of DCNNs for face recognition. The evaluation has been carried out from two main perspectives: the network architecture and the similarity measures used to compare deeply learned features; and the data (source and quality) and their preprocessing (bounding box and alignment). The results obtained on the IARPA Janus Benchmark-A, MegaFace, UMDFaces, and YouTube Faces data sets provide practical guidance for designing, training, and testing DCNNs. Based on the outcomes of the experimental evaluation, we show that performance competitive with the state of the art can be reached even with standard DCNN architectures and a standard pipeline.
