Paper information - On extending F-measure and G-mean metrics to multi-class problems

On extending F-measure and G-mean metrics to multi-class problems

The evaluation of classifiers is not an easy task. There are various ways of testing them and various measures for estimating their performance. The great majority of these measures were defined for two-class problems, and there is no consensus on how to generalize them to multiclass problems. This paper proposes extending the F-measure and G-mean in the same fashion as was done for the AUC. Several datasets with diverse characteristics are used to generate fuzzy classifiers and C4.5 trees. The most common evaluation metrics are implemented and compared in terms of their output values: the greater the value, the more optimistic the measure. The results suggest that there are two well-behaved measures in opposite roles: one is always optimistic and the other always pessimistic.
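To make the two metrics concrete, the sketch below computes them for a multiclass problem from a confusion matrix. It uses two common generalizations, the macro-averaged F-measure and the geometric mean of per-class recalls. These are not necessarily the extensions proposed in the paper, which the abstract does not define. The function names and the toy labels are hypothetical, chosen for illustration only.

```python
import math


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_recall_precision(matrix):
    """One-vs-rest recall and precision for every class."""
    n = len(matrix)
    recalls, precisions = [], []
    for k in range(n):
        tp = matrix[k][k]
        actual = sum(matrix[k])                         # examples truly in class k
        predicted = sum(matrix[i][k] for i in range(n)) # examples predicted as k
        recalls.append(tp / actual if actual else 0.0)
        precisions.append(tp / predicted if predicted else 0.0)
    return recalls, precisions


def macro_f_measure(matrix):
    """Unweighted mean of the per-class F1 scores (assumed generalization)."""
    recalls, precisions = per_class_recall_precision(matrix)
    scores = [
        2 * p * r / (p + r) if (p + r) else 0.0
        for r, p in zip(recalls, precisions)
    ]
    return sum(scores) / len(scores)


def g_mean(matrix):
    """Geometric mean of the per-class recalls (assumed generalization)."""
    recalls, _ = per_class_recall_precision(matrix)
    return math.prod(recalls) ** (1 / len(recalls))


# Toy three-class example with made-up labels.
y_true = ["a", "a", "a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "a", "b", "b", "b", "c", "a"]
matrix = confusion_matrix(y_true, y_pred, labels=["a", "b", "c"])
print(macro_f_measure(matrix), g_mean(matrix))
```

Because the geometric mean is a product, a single class with zero recall drives this G-mean to zero, whereas the macro F-measure only loses that class's share of the average. This is one way such measures can end up more pessimistic or more optimistic than one another on the same classifier.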

N. F. F. Ebecken | R. P. Espíndola | N. Ebecken
