Max-margin Class Imbalanced Learning with Gaussian Affinity

Real-world object classes appear in imbalanced ratios. This poses a significant challenge for classifiers, which become biased towards frequent classes. We hypothesize that improving the generalization capability of a classifier should improve learning on imbalanced datasets. Here, we introduce the first hybrid loss function that jointly performs classification and clustering in a single formulation. Our approach is based on an 'affinity measure' in Euclidean space that leads to the following benefits: (1) direct enforcement of maximum margin constraints on classification boundaries, (2) a tractable way to ensure uniformly spaced and equidistant cluster centers, and (3) flexibility to learn multiple class prototypes to support diversity and discriminability in feature space. Our extensive experiments demonstrate significant performance improvements on visual classification and verification tasks on multiple imbalanced datasets. The proposed loss can easily be plugged into any deep architecture as a differentiable block and demonstrates robustness against different levels of data imbalance and corrupted labels.
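
The abstract does not state the loss in closed form, so the following is only a minimal sketch of how its first two properties could be realised. It assumes a Gaussian affinity of the form exp(-||f - w||^2 / sigma) between a feature f and a class prototype w, a hinge-style margin between the true-class affinity and every other affinity, and a variance penalty on pairwise prototype distances to encourage equidistant centers. The function names, the values of `sigma` and `margin`, and the use of a single prototype per class are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def gaussian_affinity(features, prototypes, sigma=10.0):
    """Affinity in (0, 1]: equals 1 when a feature coincides with a prototype.

    features:   (n, d) array of feature vectors
    prototypes: (c, d) array, one prototype per class (assumption)
    """
    diff = features[:, None, :] - prototypes[None, :, :]   # (n, c, d)
    sq_dist = np.sum(diff ** 2, axis=-1)                    # (n, c)
    return np.exp(-sq_dist / sigma)

def max_margin_affinity_loss(features, labels, prototypes,
                             sigma=10.0, margin=0.75):
    """Hinge loss asking the true-class affinity to exceed every
    other class affinity by at least `margin` (assumed form)."""
    aff = gaussian_affinity(features, prototypes, sigma)    # (n, c)
    rows = np.arange(features.shape[0])
    true_aff = aff[rows, labels][:, None]                   # (n, 1)
    hinge = np.maximum(0.0, margin + aff - true_aff)        # (n, c)
    hinge[rows, labels] = 0.0                               # skip the true class
    return hinge.sum(axis=1).mean()

def spacing_penalty(prototypes):
    """Variance of pairwise squared distances between prototypes;
    it is zero exactly when all prototypes are equidistant."""
    diff = prototypes[:, None, :] - prototypes[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    upper = np.triu_indices(prototypes.shape[0], k=1)       # each pair once
    pair = sq_dist[upper]
    return np.mean((pair - pair.mean()) ** 2)

# Toy check: three well-separated, equidistant prototypes.
prototypes = 10.0 * np.eye(3)
features = prototypes.copy()            # each feature sits on a prototype
correct = np.array([0, 1, 2])
shuffled = np.array([1, 2, 0])          # deliberately wrong labels

loss_correct = max_margin_affinity_loss(features, correct, prototypes)
loss_shuffled = max_margin_affinity_loss(features, shuffled, prototypes)
penalty = spacing_penalty(prototypes)
```

In this toy setting the loss vanishes when every feature lies on its own class prototype and grows when the labels are shuffled, while the spacing penalty is zero because the three prototypes are mutually equidistant. In a deep network the same expressions would be written with a differentiable array library so that features and prototypes are learned jointly.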
