Improving Face Recognition by Exploring Local Features with Visual Attention

Over the past several years, the performance of state-of-the-art face recognition systems has improved significantly, due in large part to the growing number of available face datasets and the proliferation of deep neural networks. This rapid progress has left popular evaluation protocols, such as standard LFW, nearly saturated and has motivated new, more challenging protocols aimed specifically at unconstrained face recognition. In this work, we use parts-based face recognition models to further improve state-of-the-art face recognition systems, as evaluated by both the LFW protocol and the newer, more challenging protocols (BLUFR, IJB-A, and IJB-B). In particular, we employ spatial transformers to automatically localize discriminative facial parts, which enables us to build an end-to-end network in which global and local features are fused, making the final feature representation more discriminative. Experimental results on the BLUFR, IJB-A, and IJB-B protocols show that the proposed approach boosts the performance of state-of-the-art face recognition systems. The proposed approach is not limited to one architecture and can also be applied to other face recognition networks.
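To make the idea of localizing parts with a spatial transformer and fusing global with local features more concrete, the following is a minimal NumPy sketch, not the network described in this work. It assumes a single-channel image, a hand-picked 2x3 affine matrix `theta` (in a real spatial transformer this would be predicted by a localization network), and plain concatenation as the fusion step; the abstract does not specify the fusion operator, so that choice, along with all function names, is hypothetical.

```python
import numpy as np

def affine_grid(theta, out_h, out_w):
    """Map every output pixel to source coordinates in [-1, 1] via a 2x3 affine matrix."""
    ys, xs = np.meshgrid(np.linspace(-1.0, 1.0, out_h),
                         np.linspace(-1.0, 1.0, out_w),
                         indexing="ij")
    target = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # (out_h, out_w, 3)
    return target @ theta.T                                  # (out_h, out_w, 2): (x_src, y_src)

def bilinear_sample(image, grid):
    """Sample a 2-D image at normalized grid locations with bilinear interpolation."""
    h, w = image.shape
    # Convert normalized coordinates to pixel coordinates (corners map to corner pixels).
    x = np.clip((grid[..., 0] + 1.0) * (w - 1) / 2.0, 0, w - 1)
    y = np.clip((grid[..., 1] + 1.0) * (h - 1) / 2.0, 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = x - x0
    wy = y - y0
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bottom = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

def fuse(global_feat, local_feats):
    """Assumed fusion step: concatenate the global vector with each local vector."""
    return np.concatenate([global_feat] + list(local_feats))

# Toy usage: crop the central region of a 5x5 "face" with a zoom-in transform.
image = np.arange(25, dtype=float).reshape(5, 5)
zoom_center = np.array([[0.5, 0.0, 0.0],
                        [0.0, 0.5, 0.0]])  # hypothetical part location, fixed by hand
part = bilinear_sample(image, affine_grid(zoom_center, 3, 3))

# Stand-in "features": flattened pixels instead of CNN embeddings.
fused = fuse(image.ravel(), [part.ravel()])
```

Because both the grid generation and the bilinear sampling are differentiable with respect to `theta`, a learned version of this crop can be trained jointly with the recognition network, which is what makes an end-to-end parts-based model possible.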
