Online Sequential Learning based on Enhanced Extreme Learning Machine using Left or Right Pseudo-inverse

The class imbalance problem has been reported as an important challenge in various fields such as Pattern Recognition, Data Mining and Machine Learning. A less explored research area is how to evaluate classifiers on imbalanced data sets. This work analyzes the behaviour of performance measures widely used on imbalanced problems, as well as other metrics recently proposed in the literature. We perform two theoretical analyses, based on Pearson correlation and on operations for a 2×2 confusion matrix, with the aim of showing the strengths and weaknesses of those performance metrics in the presence of skewed distributions.
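As a rough illustration of why the choice of measure matters on skewed data, the sketch below computes several common measures from a 2×2 confusion matrix. The set of measures (accuracy, true positive rate, true negative rate, precision, F1, geometric mean) and the counts are assumptions chosen for illustration; they are not taken from the analysis in this work.

```python
from math import sqrt


def metrics(tp, fn, fp, tn):
    """Common measures from a 2x2 confusion matrix.

    The positive class is assumed to be the minority class.
    Ratios with an empty denominator are reported as 0.0.
    """
    total = tp + fn + fp + tn
    tpr = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity / recall
    tnr = tn / (tn + fp) if (tn + fp) else 0.0        # specificity
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * tpr / (precision + tpr)
          if (precision + tpr) else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "tpr": tpr,
        "tnr": tnr,
        "precision": precision,
        "f1": f1,
        "g_mean": sqrt(tpr * tnr),
    }


# Hypothetical data set: 10 positives, 990 negatives.
# Classifier A labels everything as negative.
trivial = metrics(tp=0, fn=10, fp=0, tn=990)
# Classifier B finds most positives at the cost of some false alarms.
useful = metrics(tp=8, fn=2, fp=50, tn=940)

for name, m in (("all-negative", trivial), ("useful", useful)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

With these hypothetical counts, accuracy ranks the all-negative classifier above the one that actually detects the minority class, while the geometric mean of the two class-wise rates ranks them the other way round.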
