Multi-objective genetic fuzzy classifiers for imbalanced and cost-sensitive datasets

We exploit an evolutionary three-objective optimization algorithm to produce a Pareto front approximation composed of fuzzy rule-based classifiers (FRBCs) with different trade-offs between accuracy (expressed in terms of sensitivity and specificity) and complexity (computed as the sum of the conditions in the antecedents of the classifier rules). We then use the ROC convex hull method to select the potentially optimal classifiers in the projection of the Pareto front approximation onto the ROC plane. Our method was tested on 13 highly imbalanced datasets and compared with two two-objective evolutionary approaches and one heuristic approach to FRBC generation, and with three well-known classifiers. Using the Wilcoxon signed-rank test, we show that our three-objective optimization approach outperforms all the other techniques, except for one classifier, in terms of the area under the ROC convex hull, an accuracy measure used to globally compare different classification approaches. Further, all the FRBCs in the ROC convex hull are characterized by a low value of complexity. Finally, we discuss how, once the misclassification costs and the class distributions are fixed, we can select the most suitable classifier for the specific application. We show that, in two specific medical applications, the FRBC selected from the convex hull produced by our three-objective optimization approach achieves the lowest classification cost among the techniques used for comparison.
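The selection steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`roc_convex_hull`, `area_under_hull`, `select_by_cost`) are hypothetical, and the expected-cost formula is the standard one from ROC analysis, assumed here rather than taken from the paper. Each classifier is represented as a point (false positive rate, true positive rate), that is, (1 - specificity, sensitivity).

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (FPR, TPR) = (1 - specificity, sensitivity)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull of ROC points, sorted by increasing FPR.

    The trivial classifiers (0, 0) and (1, 1) are always included.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last hull point while it lies on or below the chord
        # joining its predecessor to the new point.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def area_under_hull(hull: List[Point]) -> float:
    """Trapezoidal area under the piecewise-linear hull."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(hull, hull[1:]))


def select_by_cost(hull: List[Point], p_pos: float,
                   cost_fn: float, cost_fp: float) -> Point:
    """Hull point with the lowest expected misclassification cost.

    Assumed cost model:
    cost = p_pos * (1 - TPR) * cost_fn + (1 - p_pos) * FPR * cost_fp
    """
    def expected_cost(pt: Point) -> float:
        fpr, tpr = pt
        return p_pos * (1.0 - tpr) * cost_fn + (1.0 - p_pos) * fpr * cost_fp

    return min(hull, key=expected_cost)


if __name__ == "__main__":
    # Illustrative ROC points only; they are not results from the paper.
    classifiers = [(0.1, 0.6), (0.2, 0.5), (0.3, 0.9), (0.5, 0.7)]
    hull = roc_convex_hull(classifiers)
    print(hull)
    print(area_under_hull(hull))
    print(select_by_cost(hull, p_pos=0.05, cost_fn=10.0, cost_fp=1.0))
```

In this sketch, raising the false-negative cost relative to the false-positive cost moves the selected point along the hull toward higher sensitivity, which is the behaviour one would expect when missing a positive case is the more expensive error.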
