A Weight-Selection Strategy on Training Deep Neural Networks for Imbalanced Classification

Deep Neural Networks (DNN) have recently received great attention due to their superior performance in many machine-learning problems. However, the use of DNN is still impeded when the input data is imbalanced. Imbalanced classification refers to problems in which one class contains far fewer samples than the others. It poses a great challenge to existing classifiers, including DNN, because the minority class is difficult to recognize. So far, there are still limited studies on how to train DNN for imbalanced classification. In this study, we propose a new strategy, based on weight selection, to reduce over-fitting when training DNN for imbalanced classification. The original training set is split into two subsets: one used for training to update the weights, and the other used for validation to select the weights; the weights that yield the best performance on the validation set are selected. To our knowledge, this is the first systematic study of a weight-selection strategy for training DNN on imbalanced classification. In experiments on 10 imbalanced datasets derived from MNIST, the DNN trained with our new strategy outperformed both the DNN trained with a standard strategy and the DNN trained with cost-sensitive learning, with statistical significance (p = 0.00512). Notably, the DNN trained with our new strategy used 20% fewer training images (12,000 fewer) yet still achieved superior performance on all 10 imbalanced datasets. The source code is available at https://github.com/antoniosehk/WSDeepNN.
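The following is a minimal sketch of the weight-selection idea described above, not the authors' WSDeepNN implementation (which is built on Theano). It assumes a Keras/TensorFlow workflow; the network architecture, the 20-epoch budget, and the use of validation loss as the selection criterion are illustrative assumptions, standing in for whatever performance measure the paper actually selects on. The original training set is split 80/20, the model is trained on the 80% split, and the weights with the best validation performance are kept.

```python
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

# Load MNIST and flatten/normalize the images.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

# Hold out 20% of the training images (12,000 of 60,000) for weight selection.
x_tr, x_val, y_tr, y_val = train_test_split(
    x_train, y_train, test_size=0.2, stratify=y_train, random_state=0)

# A small illustrative network; the paper's architecture may differ.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

best_val_loss = np.inf
best_weights = model.get_weights()
for epoch in range(20):
    # Update weights on the training split only.
    model.fit(x_tr, y_tr, batch_size=128, epochs=1, verbose=0)
    # Evaluate on the held-out validation split and keep the best weights.
    val_loss, val_acc = model.evaluate(x_val, y_val, verbose=0)
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        best_weights = model.get_weights()

# Restore the selected weights before evaluating on the test set.
model.set_weights(best_weights)
```

In this sketch the selection criterion is validation loss; for an imbalanced problem one would typically monitor a class-balance-aware measure (e.g., F-measure or G-mean) on the validation split instead.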
