Talker recognition in tandem with talker-independent isolated word recognition

A talker recognition system operating in tandem with a talker-independent isolated word recognizer is described and evaluated. The word recognizer uses a small set of reference templates for each vocabulary word. Each set is intended to span and typify individual talker templates over a large population of talkers. Word recognition decisions are based on template distance scores obtained by comparing processed input utterances to each set of reference templates. The distribution of distance scores for the templates corresponding to the word actually spoken has been found to be reasonably consistent for individual talkers, and to vary sufficiently from talker to talker to provide the basis for a talker recognition capability. A system has been implemented to exploit this capability. An evaluation of the system, carried out using a 100-talker database of digit utterances, shows that good talker recognition performance can be obtained for input utterances consisting of sequences of seven or more digits. Identification error rates varying from 3.6 to 14.0 percent for talker populations varying from 10 to 100 talkers are obtained. When the recognizer orders the talkers as candidates for recognition, the correct talker is found, on average, among the top 0.8 percent of the population. When the system is tested in a talker verification mode, the average error rate is approximately 8 percent.
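
The idea of recognizing a talker from template distance scores can be illustrated with a minimal sketch. It is not the system evaluated here: the profile layout, the Euclidean mismatch measure, the averaging over words, the verification threshold, and every name and number below are assumptions chosen only for illustration. Each hypothetical talker profile stores, for each word, a typical distance score to each of that word's reference templates. An input utterance is scored against every profile, and the talkers are ordered by mismatch.

```python
from math import dist

# Hypothetical talker profiles: talker -> word -> typical distance score to
# each reference template of that word (made-up values, 3 templates per word).
PROFILES = {
    "talker_a": {"one": [0.20, 0.55, 0.70], "two": [0.60, 0.25, 0.50]},
    "talker_b": {"one": [0.65, 0.30, 0.40], "two": [0.35, 0.70, 0.45]},
    "talker_c": {"one": [0.45, 0.45, 0.20], "two": [0.50, 0.50, 0.20]},
}


def rank_talkers(utterance, profiles):
    """Order talkers from best to worst match.

    `utterance` is a list of (word, distance_scores) pairs, where
    distance_scores holds the input's distance to each reference template
    of the recognized word. The mismatch measure (Euclidean distance between
    score vectors, averaged over words) is an assumption of this sketch.
    """
    scores = {}
    for talker, profile in profiles.items():
        total = sum(dist(d, profile[word]) for word, d in utterance)
        scores[talker] = total / len(utterance)
    return sorted(scores.items(), key=lambda item: item[1])


def verify(utterance, claimed, profiles, threshold):
    """Accept the claimed identity if its mismatch is within the threshold."""
    score = dict(rank_talkers(utterance, profiles))[claimed]
    return score <= threshold


# A two-word input whose distance scores resemble talker_a's profile.
utterance = [("one", [0.22, 0.50, 0.72]), ("two", [0.58, 0.30, 0.48])]
ranking = rank_talkers(utterance, PROFILES)
best_talker = ranking[0][0]
```

In this sketch the ordered list corresponds to the candidate ordering used for identification, and `verify` corresponds to the verification mode, in which only the claimed talker's score is compared against a threshold. Averaging over more words gives a more stable score, which is in keeping with the finding that longer digit sequences are needed for good performance.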