σ²R Loss: A Weighted Loss by Multiplicative Factors Using Sigmoidal Functions

In neural networks, the loss function is the core of the learning process: it guides the optimizer toward an approximation of the optimal convergence error. Convolutional neural networks (CNNs) use the loss function as a supervisory signal to train a deep model, and it contributes significantly to achieving state-of-the-art results in several computer vision tasks. The cross-entropy and center loss functions are commonly used to increase the discriminative power of the learned features and to improve the generalization performance of the model. Center loss minimizes the intra-class variance and, at the same time, penalizes long distances between the deep features within each class. However, the total error of the center loss is heavily influenced by the majority of the instances, which can cause the intra-class variance to stagnate in a frozen state. To address this, we introduce a new loss function, called sigma squared reduction loss ($\sigma^2$R loss), which is regulated by a sigmoid function that inflates or deflates the error of each instance so that the intra-class variance keeps shrinking. Our loss has a clear intuition and geometric interpretation; furthermore, we demonstrate experimentally the effectiveness of our proposal on several benchmark datasets, showing the reduction of the intra-class variance and improving on the results obtained with the center loss and soft nearest neighbour loss functions.
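
To make the idea concrete, below is a minimal PyTorch sketch of a sigmoid-weighted center loss in the spirit of the $\sigma^2$R loss described above. This is an illustrative assumption, not the paper's exact formulation: the way the sigmoid factor is attached to the per-instance error, and the parameters `alpha` (slope) and `beta` (offset), are hypothetical choices made here for demonstration.

```python
import torch
import torch.nn as nn

class SigmaSquaredRLoss(nn.Module):
    """Sketch of a sigmoid-weighted center-loss-style term.

    Each instance's squared distance to its class center is rescaled by a
    sigmoid multiplicative factor, so that no subset of instances dominates
    the total error and the intra-class variance keeps shrinking. The exact
    weighting in the paper may differ; `alpha` and `beta` are hypothetical.
    """

    def __init__(self, num_classes: int, feat_dim: int,
                 alpha: float = 1.0, beta: float = 0.0):
        super().__init__()
        # One learnable center per class, as in the standard center loss.
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.alpha = alpha  # sigmoid slope (assumed)
        self.beta = beta    # sigmoid offset (assumed)

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Squared Euclidean distance of each deep feature to its class center.
        diffs = features - self.centers[labels]   # (batch, feat_dim)
        sq_dist = (diffs ** 2).sum(dim=1)         # (batch,)
        # Sigmoid multiplicative factor: inflates/deflates each instance's error.
        weights = torch.sigmoid(self.alpha * sq_dist + self.beta)
        return (weights * sq_dist).mean()
```

In training, such a term would typically be combined with cross-entropy, e.g. `loss = ce(logits, labels) + lam * sigma2r(features, labels)`, where `lam` is a hypothetical balancing weight.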
