PERFORMANCE MEASURES FOR INFORMATION EXTRACTION

While precision and recall have served the information extraction community well as two separate measures of system performance, we show that the F-measure, the weighted harmonic mean of precision and recall, exhibits certain undesirable behaviors. To overcome these limitations, we define an error measure, the slot error rate, which combines the different types of error directly, without resorting to precision and recall as preliminary measures. The slot error rate is analogous to the word error rate used for measuring speech recognition performance; it is intended to measure the cost to the user of the different types of errors the system makes.
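The two kinds of measure can be compared with a minimal sketch. The abstract does not state the formulas, so the definitions below are assumptions: slots are counted as correct, substituted, deleted, or inserted, as in word error rate scoring, and the slot error rate is taken to be the total number of errors divided by the number of slots in the reference. The names `SlotCounts`, `f_measure`, and `slot_error_rate` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotCounts:
    """Slot-level scoring counts (hypothetical container for this sketch)."""
    correct: int
    substitutions: int
    deletions: int
    insertions: int

    @property
    def reference_slots(self) -> int:
        # Slots in the reference: each is correct, substituted, or deleted.
        return self.correct + self.substitutions + self.deletions

    @property
    def hypothesis_slots(self) -> int:
        # Slots the system produced: each is correct, substituted, or inserted.
        return self.correct + self.substitutions + self.insertions


def precision(c: SlotCounts) -> float:
    return c.correct / c.hypothesis_slots if c.hypothesis_slots else 0.0


def recall(c: SlotCounts) -> float:
    return c.correct / c.reference_slots if c.reference_slots else 0.0


def f_measure(c: SlotCounts, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (beta = 1 weights them equally)."""
    p, r = precision(c), recall(c)
    denom = beta ** 2 * p + r
    return (1 + beta ** 2) * p * r / denom if denom else 0.0


def slot_error_rate(c: SlotCounts) -> float:
    """Assumed definition: all errors divided by the number of reference slots."""
    if c.reference_slots == 0:
        raise ValueError("slot error rate is undefined without reference slots")
    return (c.substitutions + c.deletions + c.insertions) / c.reference_slots


if __name__ == "__main__":
    counts = SlotCounts(correct=70, substitutions=10, deletions=20, insertions=10)
    print(f"P = {precision(counts):.3f}, R = {recall(counts):.3f}")
    print(f"F = {f_measure(counts):.3f}, SER = {slot_error_rate(counts):.3f}")
```

Under these assumed definitions, the balanced F-measure reduces to 2C / (2C + 2S + D + I), so each deletion or insertion carries half the weight of a substitution, whereas the slot error rate counts every error once. This is one way the measures can diverge; it should not be read as the authors' own analysis.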
