A Light CNN for Deep Face Representation With Noisy Labels

Convolutional neural network (CNN) models proposed for face recognition have grown steadily larger in order to fit ever larger training sets. When training data are collected from the Internet, the labels are likely to be ambiguous and inaccurate. This paper presents a Light CNN framework that learns a compact embedding from large-scale face data with massive noisy labels. First, we introduce a variation of maxout activation, called max-feature-map (MFM), into each convolutional layer of the CNN. Unlike maxout activation, which uses many feature maps to linearly approximate an arbitrary convex activation function, MFM does so via a competitive relationship. MFM not only separates noisy signals from informative ones but also performs feature selection between two feature maps. Second, three networks are carefully designed to obtain better performance while reducing the number of parameters and the computational cost. Finally, a semantic bootstrapping method is proposed to make the predictions of the networks more consistent with noisy labels. Experimental results show that the proposed framework can use large-scale noisy data to learn a light model that is efficient in both computational cost and storage space. The learned single network, with a 256-D representation, achieves state-of-the-art results on various face benchmarks without fine-tuning.
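The competitive relationship behind MFM can be illustrated with a minimal NumPy sketch. It assumes that the output of a convolutional layer is split into two halves along the channel axis and that the element-wise maximum of the two halves is kept. The abstract states only that MFM selects between two feature maps, so the function name `max_feature_map`, the channel-first layout, and the half-split pairing are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def max_feature_map(x):
    """Illustrative MFM sketch (assumed pairing: channel k with channel k + C/2).

    x: array of shape (channels, height, width) with an even channel count.
    Returns an array of shape (channels // 2, height, width).
    """
    channels = x.shape[0]
    if channels % 2 != 0:
        raise ValueError("MFM sketch expects an even number of channels")
    half = channels // 2
    first, second = x[:half], x[half:]
    # Element-wise competition: only the larger response of each pair survives,
    # which halves the number of feature maps.
    return np.maximum(first, second)


# Four 1x2 feature maps are reduced to two.
x = np.array([[[1.0, -2.0]],
              [[3.0, 0.5]],
              [[-1.0, 4.0]],
              [[2.0, 0.0]]])
out = max_feature_map(x)
```

Under these assumptions the operation has no learnable parameters of its own and halves the channel count, which fits the abstract's description of MFM as a form of feature selection.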
