To Balance or Not to Balance: A Simple-yet-Effective Approach for Learning with Long-Tailed Distributions

Real-world visual data often exhibits a long-tailed distribution, where some ''head'' classes have a large number of samples, yet only a few samples are available for ''tail'' classes. Such imbalanced distribution causes a great challenge for learning a deep neural network, which can be boiled down into a dilemma: on the one hand, we prefer to increase the exposure of tail class samples to avoid the excessive dominance of head classes in the classifier training. On the other hand, oversampling tail classes makes the network prone to over-fitting, since head class samples are often consequently under-represented. To resolve this dilemma, in this paper, we propose a simple-yet-effective auxiliary learning approach. The key idea is to split a network into a classifier part and a feature extractor part, and then employ different training strategies for each part. Specifically, to promote the awareness of tail-classes, a class-balanced sampling scheme is utilised for training both the classifier and the feature extractor. For the feature extractor, we also introduce an auxiliary training task, which is to train a classifier under the regular random sampling scheme. In this way, the feature extractor is jointly trained from both sampling strategies and thus can take advantage of all training data and avoid the over-fitting issue. Apart from this basic auxiliary task, we further explore the benefit of using self-supervised learning as the auxiliary task. Without using any bells and whistles, our model achieves superior performance over the state-of-the-art solutions.

[1]  Ling Shao,et al.  Gaussian Affinity for Max-Margin Class Imbalanced Learning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[2]  Chen Huang,et al.  Learning Deep Representation for Imbalanced Classification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[4]  Lei Zhang,et al.  One-shot Face Recognition by Promoting Underrepresented Classes , 2017, ArXiv.

[5]  Nitesh V. Chawla,et al.  SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..

[6]  Nikos Komodakis,et al.  Unsupervised Representation Learning by Predicting Image Rotations , 2018, ICLR.

[7]  Silvio Savarese,et al.  Deep Metric Learning via Lifted Structured Feature Embedding , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Ling Shao,et al.  Striking the Right Balance With Uncertainty , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Bolei Zhou,et al.  Places: An Image Database for Deep Scene Understanding , 2016, ArXiv.

[10]  Josef Kittler,et al.  A Multiple Expert Approach to the Class Imbalance Problem Using Inverse Random under Sampling , 2009, MCS.

[11]  Léon Bottou,et al.  Large-Scale Machine Learning with Stochastic Gradient Descent , 2010, COMPSTAT.

[12]  Yu Qiao,et al.  A Discriminative Feature Learning Approach for Deep Face Recognition , 2016, ECCV.

[13]  Yang Song,et al.  Class-Balanced Loss Based on Effective Number of Samples , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Yair Movshovitz-Attias,et al.  No Fuss Distance Metric Learning Using Proxies , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[15]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[16]  Qingming Huang,et al.  Relay Backpropagation for Effective Learning of Deep Convolutional Neural Networks , 2015, ECCV.

[17]  Martial Hebert,et al.  Learning to Model the Tail , 2017, NIPS.

[18]  Tim Salimans,et al.  Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks , 2016, NIPS.

[19]  Kaiming He,et al.  Exploring the Limits of Weakly Supervised Pretraining , 2018, ECCV.

[20]  Xiao Zhang,et al.  Range Loss for Deep Face Recognition with Long-Tailed Training Data , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[21]  Wei Wu,et al.  Dynamic Curriculum Learning for Imbalanced Data Classification , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[22]  Yijing Li,et al.  Learning from class-imbalanced data: Review of methods and applications , 2017, Expert Syst. Appl..

[23]  Kaiming He,et al.  Focal Loss for Dense Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[24]  Andrew J. Davison,et al.  Self-Supervised Generalisation with Meta Auxiliary Learning , 2019, NeurIPS.

[25]  Colin Wei,et al.  Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss , 2019, NeurIPS.

[26]  Marco Körner,et al.  Auxiliary Tasks in Multi-task Learning , 2018, ArXiv.

[27]  Nikos Komodakis,et al.  Dynamic Few-Shot Visual Learning Without Forgetting , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[28]  Subhransu Maji,et al.  When Does Self-supervision Improve Few-shot Learning? , 2019, ECCV.

[29]  Yun Fu,et al.  Low-Shot Face Recognition with Hybrid Classifiers , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[30]  Patrick Pérez,et al.  Boosting Few-Shot Visual Learning With Self-Supervision , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[31]  Xiaogang Wang,et al.  Factors in Finetuning Deep Model for Object Detection with Long-Tail Distribution , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Xiaogang Wang,et al.  Deep Learning Face Representation by Joint Identification-Verification , 2014, NIPS.

[33]  Alexander Kolesnikov,et al.  Revisiting Self-Supervised Visual Representation Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Galina Lavrentyeva,et al.  Doppelganger Mining for Face Representation Learning , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[35]  Anelia Angelova,et al.  Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[36]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Nir Ailon,et al.  Deep Metric Learning Using Triplet Network , 2014, SIMBAD.

[38]  Stella X. Yu,et al.  Large-Scale Long-Tailed Recognition in an Open World , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).