Impact studies and sensitivity analysis in medical data mining with ROC-based genetic learning

ROC curves have been used for the fair comparison of machine learning algorithms since the late 1990s. Accordingly, the area under the ROC curve (AUC) is now considered a relevant learning criterion, one that accommodates imbalanced data, misclassification costs and noisy data. We show how genetic algorithm-based optimization of the AUC criterion can be exploited for impact studies and sensitivity analysis. The approach is illustrated on the Atherosclerosis Identification problem from the PKDD 2002 Challenge.
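To make the idea of using the AUC as a learning criterion concrete, here is a minimal sketch in Python. It is not the authors' algorithm: it assumes a linear scoring function and a simple elitist, mutation-only evolutionary loop, and the names `auc`, `linear_scores` and `evolve_weights`, as well as all parameter defaults, are hypothetical choices made for illustration. The AUC is computed as the Wilcoxon-Mann-Whitney statistic, i.e. the fraction of (positive, negative) pairs that the scores rank correctly.

```python
import random


def auc(scores, labels):
    """AUC as the Wilcoxon-Mann-Whitney statistic.

    Counts the fraction of (positive, negative) pairs in which the positive
    example gets the higher score; ties count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def linear_scores(weights, X):
    """Score each example as a weighted sum of its features (an assumption)."""
    return [sum(w * x for w, x in zip(weights, row)) for row in X]


def evolve_weights(X, y, generations=50, offspring=10, sigma=0.3, seed=0):
    """Elitist, mutation-only evolutionary search maximizing the AUC.

    A child replaces the current parent whenever its AUC is not worse,
    so the returned fitness never decreases over the run.
    """
    rng = random.Random(seed)
    parent = [rng.gauss(0, 1) for _ in range(len(X[0]))]
    parent_fit = auc(linear_scores(parent, X), y)
    for _ in range(generations):
        for _ in range(offspring):
            # Gaussian mutation of every weight (hypothetical step size sigma).
            child = [w + rng.gauss(0, sigma) for w in parent]
            fit = auc(linear_scores(child, X), y)
            if fit >= parent_fit:
                parent, parent_fit = child, fit
    return parent, parent_fit
```

Because the fitness depends only on how examples are ranked, not on a decision threshold, the same loop applies unchanged when the classes are imbalanced. Calling `evolve_weights(X, y)` on a feature matrix `X` (a list of rows) and binary labels `y` returns the best weight vector found together with its AUC on that data.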
