Figures of merit for assessing connected-word recognisers

Abstract: This paper is mainly concerned with the choice of a figure of merit for representing the performance of connected-word recognisers when dynamic programming (DP) word-symbol sequence matching is used for scoring. Properties of the DP scoring method are discussed. Experimental tests using data from the DARPA Resource Management Task confirm a prediction made from random-number simulations: DP scoring overestimates substitution errors and underestimates insertion and deletion errors. As a result, the commonly used total error measure has a particularly large bias. An alternative measure, percent correct, has lower bias but ignores insertion errors. A new figure of merit, weighted total errors, takes all three kinds of error into account and minimises bias. Finally, some more sophisticated figures of merit are discussed briefly.
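To make the quantities named in the abstract concrete, the following is a minimal sketch of DP word-symbol sequence scoring and the figures of merit derived from it. It is an illustration under stated assumptions, not the paper's implementation: the unit penalties for substitutions, deletions and insertions, the function names, and the example sentences are all hypothetical. In particular, the abstract does not give the definition of weighted total errors, so the form `S + w * (D + I)` and the weight `w` used here are placeholders only.

```python
from typing import NamedTuple, Sequence


class Counts(NamedTuple):
    n_ref: int  # number of words in the reference
    subs: int   # substitutions
    dels: int   # deletions
    ins: int    # insertions


def dp_score(ref: Sequence[str], hyp: Sequence[str]) -> Counts:
    """Align hyp to ref by DP and count errors (unit penalties: an assumption)."""
    n, m = len(ref), len(hyp)
    # Each cell holds (total_cost, subs, dels, ins) for the best partial alignment.
    table = [[(0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        table[i][0] = (i, 0, i, 0)  # all reference words deleted
    for j in range(1, m + 1):
        table[0][j] = (j, 0, 0, j)  # all hypothesis words inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, k = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (t, s, d, k)          # correct word
            else:
                diag = (t + 1, s + 1, d, k)  # substitution
            t, s, d, k = table[i - 1][j]
            dele = (t + 1, s, d + 1, k)      # deletion
            t, s, d, k = table[i][j - 1]
            inse = (t + 1, s, d, k + 1)      # insertion
            # Ties are broken in favour of the diagonal move, then deletion.
            table[i][j] = min(diag, dele, inse, key=lambda c: c[0])
    _, s, d, k = table[n][m]
    return Counts(n, s, d, k)


def percent_correct(c: Counts) -> float:
    # Ignores insertions, as the abstract notes.
    return 100.0 * (c.n_ref - c.subs - c.dels) / c.n_ref


def total_error(c: Counts) -> float:
    return 100.0 * (c.subs + c.dels + c.ins) / c.n_ref


def weighted_total_errors(c: Counts, w: float) -> float:
    # Hypothetical form; w is a placeholder, not a value taken from the paper.
    return 100.0 * (c.subs + w * (c.dels + c.ins)) / c.n_ref


ref = "a b c d e f g".split()
hyp = "a x c e f g h".split()
counts = dp_score(ref, hyp)
print(counts, percent_correct(counts), total_error(counts))
```

Because the DP only minimises total cost, a deletion next to an insertion can often be explained equally cheaply as a substitution, and the tie-breaking rule then decides which is counted. This is one informal way to see why the choice of figure of merit interacts with the scoring method.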
