2000 NIST EVALUATION OF CONVERSATIONAL SPEECH RECOGNITION OVER THE TELEPHONE: ENGLISH AND MANDARIN PERFORMANCE RESULTS

This paper documents the use of conversational telephone speech test materials in the NIST-coordinated evaluation conducted early in 2000. The primary evaluation was of General American English speech, but a subsidiary evaluation of Mandarin speech was also offered. The primary test data consisted of twenty conversations collected for the original Switchboard Corpus but not released with the published corpus, and twenty conversations from the CallHome English Corpus. The lowest English word error rates this year were 19.3% for the Switchboard-type data and 31.4% for the CallHome data. These error rates are considerably lower than those achieved in previous evaluations, the most recent of which was in 1998. The reductions were due in part to improved recognition systems, but also in large part to these test sets being easier than those used in previous evaluations. We discuss in the Appendices some reasons why these test sets were easier than previous ones.