Analysis of an evolutionary RBFN design algorithm, CO2RBFN, for imbalanced data sets

In classification, many real application areas involve data that are not evenly distributed among the classes of the problem. Such data are known as imbalanced data sets. This scenario is of significant interest because standard classifiers are often biased towards the majority classes, whereas the minority classes are usually the more valuable ones, as they define the concepts of interest from the learning point of view. The aim of this paper is to analyse the performance of CO2RBFN, an evolutionary cooperative-competitive model for the design of radial basis function networks, applied to classification problems on imbalanced domains, and to study its cooperation with a well-known pre-processing method, the "synthetic minority over-sampling technique" (SMOTE). The good performance of CO2RBFN is shown through an experimental study carried out on a large collection of imbalanced data sets, in which we compare, by means of a proper statistical study, the behaviour of our model with several representative neural network algorithms, the C4.5 decision tree and a hierarchical fuzzy rule-based classification system.
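To make the pre-processing step concrete, the following is a minimal sketch of the interpolation idea behind the synthetic minority over-sampling technique: each synthetic sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. It is an illustration under stated assumptions, not the implementation used in the study. The function name `smote`, its parameters, and the toy data are hypothetical, and the sketch draws base samples at random rather than over-sampling each minority sample by a fixed rate.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples.

    Simplified sketch: base samples are drawn at random (an assumption made
    here for brevity), and only numeric features are handled.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of base (Euclidean).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy minority class with two numeric features (made-up values).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_synthetic=6, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, it stays inside the region already covered by the minority class. The over-sampled training set would then be passed to the classifier in place of the original one.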
