Convex Hull-Based Multi-objective Genetic Programming for Maximizing ROC Performance

Receiver operating characteristic (ROC) analysis is widely used to evaluate the performance of classifiers in data mining. The ROC convex hull (ROCCH) is the least convex majorant (LCM) of the empirical ROC curve and covers the potential optima for a given set of classifiers. ROC performance maximization can therefore be treated as maximizing the ROCCH, which means maximizing the true positive rate (tpr) and minimizing the false positive rate (fpr) of each classifier in ROC space. However, tpr and fpr conflict with each other during ROCCH optimization. Although ROCCH maximization resembles a multi-objective optimization problem (MOP), its special characteristics distinguish it from a traditional MOP. In this work, we discuss the differences between the two and propose convex hull-based multi-objective genetic programming (CH-MOGP) to solve ROCCH maximization problems. Convex hull-based sort is an indicator-based selection scheme that aims to maximize the area under the convex hull, which serves as a unary indicator of the performance of a set of points. We describe a selection procedure that can be implemented efficiently and follows design principles similar to those of classical hypervolume-based optimization algorithms. We hypothesize that, by using a tailored indicator-based selection scheme, CH-MOGP approximates the ROC convex hull more efficiently than algorithms that compute all Pareto-optimal points. To test this hypothesis, we compare CH-MOGP with MOGP using classical selection schemes, including NSGA-II, MOEA/D, and SMS-EMOA. CH-MOGP is also compared with traditional machine learning algorithms such as C4.5, Naive Bayes, and PRIE. Experimental results on 22 well-known UCI data sets show that CH-MOGP significantly outperforms traditional EMOAs.
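
To make the indicator concrete, the following is a minimal sketch of how the ROCCH and the area under it can be computed for a set of classifiers given as (fpr, tpr) points. It is not the CH-MOGP selection procedure itself. The function names `roc_convex_hull` and `area_under_hull`, the sample points, and the choice of a monotone-chain upper hull are assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (fpr, tpr) of one classifier; hypothetical type alias


def _cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of vectors o->a and o->b (positive means a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull in ROC space, from (0, 0) to (1, 1).

    The trivial classifiers (0, 0) and (1, 1) are always added, so points
    below the diagonal never appear on the hull.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last hull point while it lies on or below the segment
        # joining its predecessor to p (no strict right turn).
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def area_under_hull(hull: List[Point]) -> float:
    """Trapezoidal area under the hull; the unary indicator to be maximized."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(hull, hull[1:]))


# Illustrative data only: (0.2, 0.65) is Pareto non-dominated, yet it lies
# below the segment from (0.1, 0.6) to (0.4, 0.9), so it is not on the hull.
classifiers = [(0.1, 0.6), (0.2, 0.65), (0.3, 0.5), (0.4, 0.9)]
hull = roc_convex_hull(classifiers)
area = area_under_hull(hull)
print(hull)
print(round(area, 3))
```

In this example the point (0.2, 0.65) is not dominated by any other classifier, but it adds nothing to the area under the hull. This is the kind of difference between ROCCH maximization and a traditional MOP that the abstract refers to: a selection scheme driven by the hull area has no reason to retain such a point, whereas a scheme that keeps all Pareto-optimal points would.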
