A probabilistic model is developed to account for the error rate behavior of isolated word speech recognition systems. Two kinds of error are examined: confusion error, an a priori characterization of a recognizer that measures differences between words, and recognition rank error, an a posteriori characterization that accounts for differences between tokens of the same word as well as differences between words. It is shown that both kinds of error can be modelled by describing recognition trials as Bernoulli trials. Good models of error rate behavior as a function of vocabulary size are obtained when the distributions of confusion number or rank number are treated as mixtures of binomial distributions. Data from a recent experiment in isolated word recognition with a large vocabulary (1109 words) are used to evaluate the model. Model functions based on mixture distributions are fitted, by means of an optimization algorithm, to experimental error rate functions obtained from each of six talkers and three partitions of the vocabulary. The results indicate that two-way mixture distributions account quite well for the experimental performance results.
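The idea of a binomial mixture for the confusion number can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: each of the other `vocab_size - 1` words is treated as an independent Bernoulli trial, an error is counted whenever the confusion number is nonzero, and the function names, weights, and probabilities are hypothetical rather than fitted values.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def mixture_pmf(k, n, weights, probs):
    """PMF of a mixture of binomials; weights are assumed to sum to 1."""
    return sum(w * binomial_pmf(k, n, p) for w, p in zip(weights, probs))


def error_rate(vocab_size, weights, probs):
    """Illustrative error rate: probability that the confusion number is nonzero.

    Assumption (not from the paper): each of the vocab_size - 1 competing
    words is an independent Bernoulli trial, and any confusion is an error.
    """
    n = vocab_size - 1
    return 1.0 - mixture_pmf(0, n, weights, probs)


if __name__ == "__main__":
    # Hypothetical two-way mixture: mostly "easy" words, a few "hard" ones.
    weights = [0.9, 0.1]
    probs = [0.0001, 0.01]
    for size in (10, 100, 1109):
        print(size, round(error_rate(size, weights, probs), 4))
```

Under these assumptions the error rate rises with vocabulary size, and the two components contribute at different rates, which is the qualitative behavior a two-way mixture is meant to capture. Fitting the weights and probabilities to measured error rate functions would require an optimizer, which this sketch does not include.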