Analysis and Visualization of Classifier Performance with Nonuniform Class and Cost Distributions

Applications of machine learning have shown repeatedly that the standard assumptions of uniform class distribution and uniform misclassification costs rarely hold. Little is known about how to select classifiers when error costs and class distributions are not known precisely at training time, or when they can change. We present a method for analyzing and visualizing the performance of classification methods that is robust to changing distributions and allows a sensitivity analysis if a range of costs is known. The method combines techniques from ROC analysis, decision analysis, and computational geometry, and adapts them to the particulars of analyzing learned classifiers. We then demonstrate the analysis and visualization properties of the method.
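As a rough illustration of how ROC analysis, decision analysis, and computational geometry can fit together, the sketch below takes one plausible reading of the abstract: each classifier is a point (false positive rate, true positive rate), the upper convex hull of those points holds the candidates that can be optimal under some class distribution and cost setting, and an expected-cost formula picks among them. The function names, the cost formula, and the sample points are assumptions made for illustration; they are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_upper_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull of ROC points, left to right.

    The trivial classifiers (0, 0) "always negative" and (1, 1)
    "always positive" are added so the hull spans the whole ROC space.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last vertex while it lies on or below the segment
        # from the vertex before it to the new point p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def expected_cost(pt: Point, p_pos: float, c_fp: float, c_fn: float) -> float:
    """Expected misclassification cost per example (assumed cost model)."""
    fpr, tpr = pt
    return p_pos * (1.0 - tpr) * c_fn + (1.0 - p_pos) * fpr * c_fp


def best_vertex(hull: List[Point], p_pos: float, c_fp: float, c_fn: float) -> Point:
    """Hull vertex with the lowest expected cost for the given conditions."""
    return min(hull, key=lambda pt: expected_cost(pt, p_pos, c_fp, c_fn))


# Hypothetical classifiers, as (FPR, TPR) pairs.
classifiers = [(0.1, 0.5), (0.3, 0.8), (0.4, 0.6), (0.6, 0.95)]
hull = roc_upper_hull(classifiers)

# Sensitivity analysis over an assumed range of false-negative costs.
for c_fn in (1.0, 5.0):
    print(c_fn, best_vertex(hull, p_pos=0.5, c_fp=1.0, c_fn=c_fn))
```

Under this reading, a classifier that falls below the hull, such as the hypothetical point (0.4, 0.6), is never the lowest-cost choice for any setting of `p_pos`, `c_fp`, and `c_fn`, so only the hull vertices need to be kept when the conditions are uncertain.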
