Deep MLPs for Imbalanced Classification

Abstract–Classification over imbalanced datasets is a highly interesting topic given that many real-world classification problems present a concrete class with a much smaller number of patterns than the others. In this work we shall explore the use of large, fully connected and potentially deep MLPs in such problems. We will consider simple MLPs, with ReLU activations, softmax outputs and categorical cross-entropy loss, showing that, when properly regularized, these relatively straightforward MLP models yield state of the art results in terms of the areas under the ROC curve for both two-class problems (the usual focus in imbalanced classification) as well as for multi-class problems.

[1]  Charles X. Ling,et al.  Using AUC and accuracy in evaluating learning algorithms , 2005, IEEE Transactions on Knowledge and Data Engineering.

[2]  José Hernández-Orallo,et al.  Volume under the ROC Surface for Multi-class Problems , 2003, ECML.

[3]  Pedro M. Domingos,et al.  Tree Induction for Probability-Based Ranking , 2003, Machine Learning.

[4]  Jesús Alcalá-Fdez,et al.  KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework , 2011, J. Multiple Valued Log. Soft Comput..

[5]  Francisco Herrera,et al.  Class Switching according to Nearest Enemy Distance for learning from highly imbalanced data-sets , 2017, Pattern Recognit..

[6]  Yoshua Bengio,et al.  Deep Sparse Rectifier Neural Networks , 2011, AISTATS.

[7]  Francisco Herrera,et al.  A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).

[8]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[9]  Xin Yao,et al.  Multiclass Imbalance Problems: Analysis and Potential Solutions , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[10]  Haibo He,et al.  Learning from Imbalanced Data , 2009, IEEE Transactions on Knowledge and Data Engineering.

[11]  Pedro M. Domingos MetaCost: a general method for making classifiers cost-sensitive , 1999, KDD '99.

[12]  Zhi-Hua Zhou,et al.  Exploratory Undersampling for Class-Imbalance Learning , 2009, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[13]  Kevin Barraclough,et al.  I and i , 2001, BMJ : British Medical Journal.

[14]  Christopher M. Bishop,et al.  Pattern Recognition and Machine Learning (Information Science and Statistics) , 2006 .

[15]  Hinrich Schütze,et al.  Introduction to information retrieval , 2008 .

[16]  Mark Goadrich,et al.  The relationship between Precision-Recall and ROC curves , 2006, ICML.

[17]  Joan Bruna,et al.  Mathematics of Deep Learning , 2017, ArXiv.

[19]  Stefano Soatto,et al.  Emergence of invariance and disentangling in deep representations , 2017 .

[20]  David J. Hand,et al.  A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems , 2001, Machine Learning.

[21]  Tom Fawcett,et al.  An introduction to ROC analysis , 2006, Pattern Recognit. Lett..

[22]  Zhi-Hua Zhou,et al.  Ieee Transactions on Knowledge and Data Engineering 1 Training Cost-sensitive Neural Networks with Methods Addressing the Class Imbalance Problem , 2022 .

[23]  Bartosz Krawczyk,et al.  Learning from imbalanced data: open challenges and future directions , 2016, Progress in Artificial Intelligence.

[24]  Grigorios Tsoumakas,et al.  Mining Multi-label Data , 2010, Data Mining and Knowledge Discovery Handbook.

[25]  Yoshua Bengio,et al.  Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.

[26]  Yiming Yang,et al.  An Evaluation of Statistical Approaches to Text Categorization , 1999, Information Retrieval.

[27]  Bo Tang,et al.  GIR-based ensemble sampling approaches for imbalanced learning , 2017, Pattern Recognit..

[28]  Monson H. Hayes,et al.  Training neural network classifiers through Bayes risk minimization applying unidimensional Parzen windows , 2018, Pattern Recognit..

[29]  Gaël Varoquaux,et al.  Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..