LinCos-Softmax: Learning Angle-Discriminative Face Representations With Linearity-Enhanced Cosine Logits

In recent years, the angle-based softmax losses have significantly improved the performance of face recognition whereas these loss functions are all based on cosine logit. A potential weakness is that the nonlinearity of the cosine function may undesirably saturate the angular optimization between the features and the corresponding weight vectors, thereby preventing the network from fully learning to maximize the angular discriminability of features. As a result, the generalization of learned features may be compromised. To tackle this issue, we propose a Linear-Cosine Softmax Loss (LinCos-Softmax) to more effectively learn angle-discriminative facial features. The main characteristic of the loss function we propose is the use of an approximated linear logit. Compared with the conventional cosine logit, it has a stronger linear relationship with the angle on enhancing angular discrimination through Taylor expansion. We also propose an automatic scale parameter selection scheme, which can conveniently provide an appropriate scale for different logits without the need for exhaustive parameter search to improve performance. In addition, we propose a margin-enhanced Linear-Cosine Softmax Loss (m-LinCos-Softmax) to further enlarge inter-class distances and reduce intra-class variations. Experimental results on several face recognition benchmarks (LFW, AgeDB-30, CFP-FP, MegaFace Challenge 1) demonstrate the effectiveness of the proposed method and its superiority to existing angular softmax loss variants.

[1]  Lanqing He,et al.  Softmax Dissection: Towards Understanding Intra- and Inter-clas Objective for Embedding Learning , 2020, AAAI.

[2]  Jian Cheng,et al.  NormFace: L2 Hypersphere Embedding for Face Verification , 2017, ACM Multimedia.

[3]  Marwan Mattar,et al.  Labeled Faces in the Wild: A Database forStudying Face Recognition in Unconstrained Environments , 2008 .

[4]  Kihyuk Sohn,et al.  Improved Deep Metric Learning with Multi-class N-pair Loss Objective , 2016, NIPS.

[5]  Ira Kemelmacher-Shlizerman,et al.  The MegaFace Benchmark: 1 Million Faces for Recognition at Scale , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Shengcai Liao,et al.  Learning Face Representation from Scratch , 2014, ArXiv.

[7]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Jianfeng Zhan,et al.  Cosine Normalization: Using Cosine Similarity Instead of Dot Product in Neural Networks , 2017, ICANN.

[9]  Yu Qiao,et al.  A Discriminative Feature Learning Approach for Deep Face Recognition , 2016, ECCV.

[10]  Silvio Savarese,et al.  Deep Metric Learning via Lifted Structured Feature Embedding , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Bhiksha Raj,et al.  SphereFace: Deep Hypersphere Embedding for Face Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Carlos D. Castillo,et al.  Frontal to profile face verification in the wild , 2016, 2016 IEEE Winter Conference on Applications of Computer Vision (WACV).

[13]  Meng Yang,et al.  Large-Margin Softmax Loss for Convolutional Neural Networks , 2016, ICML.

[14]  Xiangyu Zhu,et al.  AdaptiveFace: Adaptive Margin and Sampling for Face Recognition , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Ming Yang,et al.  DeepFace: Closing the Gap to Human-Level Performance in Face Verification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Weilin Huang,et al.  Deep Metric Learning with Hierarchical Triplet Loss , 2018, ECCV.

[17]  Jiansheng Chen,et al.  Rethinking Feature Distribution for Loss Functions in Image Classification , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[18]  Yu Liu,et al.  Rethinking Feature Discrimination and Polymerization for Large-scale Recognition , 2017, ArXiv.

[19]  Atsuto Maki,et al.  Feature Contraction: New ConvNet Regularization in Image Classification , 2018, BMVC.

[20]  Xiaogang Wang,et al.  AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Stefanos Zafeiriou,et al.  Marginal Loss for Deep Face Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[22]  Haifeng Shen,et al.  Virtual Class Enhanced Discriminative Embedding Learning , 2018, NeurIPS.

[23]  Xing Ji,et al.  CosFace: Large Margin Cosine Loss for Deep Face Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[24]  Stefanos Zafeiriou,et al.  ArcFace: Additive Angular Margin Loss for Deep Face Recognition , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Yu Qiao,et al.  A Comprehensive Study on Center Loss for Deep Face Recognition , 2019, International Journal of Computer Vision.

[26]  Shifeng Zhang,et al.  Mis-classified Vector Guided Softmax Loss for Face Recognition , 2019, AAAI.

[27]  Xiaogang Wang,et al.  Deep Learning Face Representation by Joint Identification-Verification , 2014, NIPS.

[28]  Chang Huang,et al.  Targeting Ultimate Accuracy: Face Recognition via Deep Embedding , 2015, ArXiv.

[29]  Andrew Zisserman,et al.  Deep Face Recognition , 2015, BMVC.

[30]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Yu Qiao,et al.  Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks , 2016, IEEE Signal Processing Letters.

[32]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[33]  Marios Savvides,et al.  Ring Loss: Convex Feature Normalization for Face Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[34]  Yuxiao Hu,et al.  MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition , 2016, ECCV.

[35]  Jian Cheng,et al.  Additive Margin Softmax for Face Verification , 2018, IEEE Signal Processing Letters.

[36]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Stefanos Zafeiriou,et al.  AgeDB: The First Manually Collected, In-the-Wild Age Database , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[38]  Kaiqi Huang,et al.  Beyond Triplet Loss: A Deep Quadruplet Network for Person Re-identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Carlos D. Castillo,et al.  L2-constrained Softmax Loss for Discriminative Face Verification , 2017, ArXiv.

[40]  Jun Li,et al.  Deep Face Recognition with Center Invariant Loss , 2017, ACM Multimedia.

[41]  Xiao Zhang,et al.  Range Loss for Deep Face Recognition with Long-Tailed Training Data , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[42]  Kai Zhao,et al.  RegularFace: Deep Face Recognition via Exclusive Regularization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Jiwen Lu,et al.  UniformFace: Learning Deep Equidistributed Representation for Face Recognition , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  Ignazio Gallo,et al.  Git Loss for Deep Face Recognition , 2018, BMVC.

[45]  Xiaogang Wang,et al.  Deeply learned face representations are sparse, selective, and robust , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[46]  Alexander J. Smola,et al.  Sampling Matters in Deep Embedding Learning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[47]  Xiaogang Wang,et al.  Deep Learning Face Representation from Predicting 10,000 Classes , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[48]  Yann LeCun,et al.  Dimensionality Reduction by Learning an Invariant Mapping , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).