Trainable Undersampling for Class-Imbalance Learning

Undersampling is widely used in class-imbalance learning. The main deficiency of most existing undersampling methods is that their data sampling strategies are heuristic and independent of both the classifier and the evaluation metric in use. As a result, they may discard instances that are informative for the classifier. In this work, we propose a meta-learning method built on undersampling to address this issue. The key idea is to parametrize the data sampler and train it to optimize classification performance under the evaluation metric. Because this training objective is non-differentiable, we solve the optimization problem via reinforcement learning. By incorporating evaluation metric optimization into the data sampling process, the proposed method can learn which instances should be discarded for the given classifier and evaluation metric. In addition, as a data-level operation, the method can be easily applied to arbitrary evaluation metrics and classifiers, including non-parametric ones (e.g., C4.5 and KNN). Experimental results on both synthetic and realistic datasets demonstrate the effectiveness of the proposed method.
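The idea of a parametrized sampler trained by reinforcement learning can be illustrated with a minimal sketch. It is not the authors' implementation: the logistic form of the sampler, the KNN classifier, the F1 reward, the moving-average baseline, and every function name below are assumptions chosen only to keep the example small. Each majority instance is kept or discarded by a Bernoulli draw, the classifier is fit on the kept subset, and the validation metric is used as the reward in a REINFORCE-style update.

```python
import math
import random

def f1_score(y_true, y_pred):
    # F1 on the minority (positive) class; stands in for "the evaluation metric".
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + fn)

def knn_predict(train_X, train_y, x, k=3):
    # Non-parametric classifier: majority vote among the k nearest neighbours.
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    nearest = order[:k]
    return 1 if 2 * sum(train_y[i] for i in nearest) > len(nearest) else 0

def keep_prob(theta, x):
    # Hypothetical sampler: logistic model giving the probability of keeping x.
    z = theta[0] + sum(w * v for w, v in zip(theta[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def sample_subset(theta, X, y, rng):
    # Minority instances (label 1) are always kept; majority ones are sampled.
    return [1 if label == 1 or rng.random() < keep_prob(theta, x) else 0
            for x, label in zip(X, y)]

def reward(actions, X, y, val_X, val_y):
    # Fit the classifier on the undersampled data and score it on validation data.
    sub_X = [x for x, a in zip(X, actions) if a]
    sub_y = [label for label, a in zip(y, actions) if a]
    preds = [knn_predict(sub_X, sub_y, x) for x in val_X]
    return f1_score(val_y, preds)

def train_sampler(X, y, val_X, val_y, epochs=30, lr=0.5, seed=0):
    rng = random.Random(seed)
    theta = [0.0] * (len(X[0]) + 1)
    baseline = 0.0  # moving-average baseline to reduce gradient variance
    n_major = max(1, sum(1 for label in y if label == 0))
    for _ in range(epochs):
        actions = sample_subset(theta, X, y, rng)
        r = reward(actions, X, y, val_X, val_y)
        advantage = r - baseline
        baseline = 0.9 * baseline + 0.1 * r
        # REINFORCE: d/dz log Bernoulli(a; p) = a - p for each sampled decision.
        grad = [0.0] * len(theta)
        for x, label, a in zip(X, y, actions):
            if label == 1:
                continue  # no decision was sampled for minority instances
            g = a - keep_prob(theta, x)
            grad[0] += g
            for j, v in enumerate(x):
                grad[j + 1] += g * v
        theta = [t + lr * advantage * g / n_major for t, g in zip(theta, grad)]
    return theta
```

Calling train_sampler(X, y, val_X, val_y) with feature lists X, binary labels y (1 for the minority class), and a held-out validation split returns the sampler weights; keep_prob(theta, x) then gives the learned probability of retaining a majority instance x. Because the classifier is only ever called as a black box inside reward, swapping in another classifier or metric would change just that function.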
