A Good Practice Towards Top Performance of Face Recognition: Transferred Deep Feature Fusion

Over the last few years, unconstrained face recognition evaluations have traditionally focused on the Labeled Faces in the Wild (LFW) dataset for still imagery and the YouTubeFaces (YTF) dataset for videos. Spectacular progress in this field has saturated verification and identification accuracy on those benchmark datasets. In this paper, we propose a unified learning framework named Transferred Deep Feature Fusion (TDFF) targeting the newer IARPA Janus Benchmark A (IJB-A) face recognition dataset, released as part of the NIST face challenge. The IJB-A dataset includes real-world unconstrained faces from 500 subjects with full pose and illumination variations, which makes it much harder than the LFW and YTF datasets. Inspired by transfer learning, we train two advanced deep convolutional neural networks (DCNNs), each on a different large dataset in the source domain. To exploit the complementarity of the two distinct DCNNs, deep feature fusion is applied after feature extraction in the target domain. Then, template-specific linear SVMs are adopted to enhance the discriminative power of the framework. Finally, the multiple matching scores corresponding to different templates are merged into the final results. This simple unified framework exhibits excellent performance on the IJB-A dataset. Based on the proposed approach, we have submitted our IJB-A results to the National Institute of Standards and Technology (NIST) for official evaluation. Moreover, by introducing new data and a more advanced neural architecture, our method outperforms the state of the art by a wide margin on the IJB-A dataset.
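
The pipeline summarized above (fuse features from two networks, train a linear SVM per template, merge the resulting scores) can be illustrated with a minimal sketch. The abstract does not specify the fusion rule, the SVM training details, the negative set, or the merging rule, so everything here is an assumption made for illustration: fusion is taken to be concatenation of L2-normalized features, the SVM is trained by plain subgradient descent on the hinge loss, and the two template scores are averaged. The function names `fuse_features`, `train_template_svm`, and `template_score` are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each feature vector (last axis) to unit Euclidean length."""
    x = np.asarray(x, dtype=float)
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def fuse_features(feat_a, feat_b):
    """Assumed fusion rule: normalize each network's features,
    concatenate them, and renormalize the result."""
    fused = np.concatenate([l2_normalize(feat_a), l2_normalize(feat_b)], axis=-1)
    return l2_normalize(fused)


def train_template_svm(template_feats, negative_feats, c=1.0, epochs=200, lr=0.1):
    """Linear SVM for one template: its media are positives, a shared
    negative set provides negatives. Trained by subgradient descent on
    the L2-regularized hinge loss (an illustrative solver choice)."""
    x = np.vstack([template_feats, negative_feats])
    y = np.concatenate([np.ones(len(template_feats)), -np.ones(len(negative_feats))])
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(epochs):
        violated = y * (x @ w + b) < 1  # samples inside the margin
        grad_w = w - c * (y[violated, None] * x[violated]).sum(axis=0) / len(y)
        grad_b = -c * y[violated].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def template_score(svm_p, feats_p, svm_q, feats_q):
    """Assumed merging rule: apply each template's SVM to the other
    template's features and average the two mean scores."""
    w_p, b_p = svm_p
    w_q, b_q = svm_q
    score_pq = np.mean(feats_q @ w_p + b_p)
    score_qp = np.mean(feats_p @ w_q + b_q)
    return 0.5 * (score_pq + score_qp)
```

In use, `feat_a` and `feat_b` would be the per-image outputs of the two trained DCNNs for one template, and `negative_feats` would be fused features drawn from other identities; a higher `template_score` indicates that two templates are more likely to show the same subject.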
