Objective and diagnostic assessment of (isolated) word recognizers

The performance of speech recognizers is measured as a function of the variation of specific speech-production and speech-transmission parameters, e.g. parameters related to interspeaker and intraspeaker variation and to the effects of stress or noise. The method uses a database of consonant-vowel-consonant (CVC) words organized into minimal-difference word sets, in three groups that test initial consonants, final consonants, and vowels respectively. By means of analysis-resynthesis, the words are manipulated to reproduce the changes in physical parameters that are observed in natural speech, between speakers, or under various environmental conditions. To define relevant parameter changes, representative speech tokens are analyzed first. Some experimental results obtained for four commercial recognizers, as well as for human listeners, are given.
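As a rough illustration of what measuring performance as a function of a manipulated parameter could look like in practice, the sketch below tallies the fraction of correctly recognized words per word group and per parameter level. It is not the scoring procedure of the study: the function name `recognition_rates`, the trial layout `(group, level, spoken, recognized)`, the integer levels, and the English example words are all assumptions invented for this example.

```python
from collections import defaultdict


def recognition_rates(trials):
    """Return the fraction of correct responses per (group, level) cell.

    Each trial is a hypothetical tuple (group, level, spoken, recognized):
    group is "initial", "final", or "vowel"; level labels one setting of
    the manipulated parameter; spoken and recognized are word strings.
    """
    counts = defaultdict(lambda: [0, 0])  # (group, level) -> [correct, total]
    for group, level, spoken, recognized in trials:
        cell = counts[(group, level)]
        cell[0] += int(spoken == recognized)
        cell[1] += 1
    return {key: correct / total for key, (correct, total) in counts.items()}


# Invented trials for illustration only; these are not results from the study.
trials = [
    ("initial", 0, "bat", "bat"),
    ("initial", 0, "pat", "pat"),
    ("initial", 1, "bat", "pat"),
    ("initial", 1, "pat", "pat"),
    ("vowel", 1, "bit", "bet"),
]

rates = recognition_rates(trials)
for (group, level), rate in sorted(rates.items()):
    print(f"{group:8s} level={level} rate={rate:.2f}")
```

Keeping the three word groups in separate cells is what would make such a tally diagnostic: a drop in one group at a given level points to the class of phonemes that the manipulated parameter affects most.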