Convex Hull-Based Multiobjective Genetic Programming for Maximizing Receiver Operating Characteristic Performance

The receiver operating characteristic (ROC) is commonly used to analyze the performance of classifiers in data mining. An important topic in ROC analysis is the ROC convex hull (ROCCH), which is the least convex majorant (LCM) of the empirical ROC curve and covers the potential optima for a given set of classifiers. ROCCH maximization problems have been treated as multiobjective optimization problems (MOPs) in some previous work. However, the special characteristics of the ROCCH maximization problem make it different from traditional MOPs. In this paper, the difference is discussed in detail and a new convex hull-based multiobjective genetic programming algorithm (CH-MOGP) is proposed to solve ROCCH maximization problems. Specifically, convex hull-based without redundancy sorting (CWR-sorting) is introduced, an indicator-based selection scheme that aims to maximize the area under the convex hull. A novel selection procedure is also proposed based on this sorting scheme. It is hypothesized that, by using a tailored indicator-based selection, CH-MOGP becomes more efficient for ROC convex hull approximation than algorithms that compute all Pareto optimal points. Empirical studies are conducted to compare CH-MOGP with both existing machine learning approaches and multiobjective genetic programming (MOGP) methods with classical selection schemes. Experimental results show that CH-MOGP significantly outperforms the other approaches.
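
To make the quantity being maximized concrete, the following is a minimal sketch of computing the ROCCH and the area under it for a set of classifiers, each given as a (false positive rate, true positive rate) point. It is not the paper's CWR-sorting or selection procedure; it uses the standard monotone-chain upper hull, and the names `roc_convex_hull` and `area_under_hull` as well as the sample points are illustrative assumptions.

```python
from typing import List, Tuple

# A classifier in ROC space: (false positive rate, true positive rate).
Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull of the points plus the trivial classifiers
    (0, 0) and (1, 1), returned from left to right."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last vertex while it lies on or below the segment
        # from the vertex before it to the new point p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def area_under_hull(hull: List[Point]) -> float:
    """Trapezoidal area under a hull ordered by increasing FPR."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(hull, hull[1:])
    )


# Hypothetical classifiers. (0.25, 0.7) is Pareto non-dominated
# (no other point has both lower FPR and higher TPR), yet it lies
# below the hull segment from (0.1, 0.6) to (0.4, 0.9).
classifiers = [(0.1, 0.6), (0.25, 0.7), (0.4, 0.9)]
hull = roc_convex_hull(classifiers)
auch = area_under_hull(hull)
```

Here `hull` keeps only (0, 0), (0.1, 0.6), (0.4, 0.9), and (1, 1), and the area under it is about 0.825. The dropped point (0.25, 0.7) illustrates why ROCCH maximization differs from a traditional MOP: a Pareto-optimal classifier may contribute nothing to the convex hull.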
