This paper describes methods for talker clustering and identification based on a "distance" metric between discrete HMM output probabilities. Output probabilities are derived from a tree-based MMI partition of the feature space rather than the usual vector quantization. The information divergence (relative entropy) between speaker-dependent models serves as a quantitative measure of how much one talker differs from another. An immediate application is talker identification: an unknown speaker may be identified by finding the speaker-dependent reference model closest to a model trained on the unknown speaker's data. Another application is clustering similar talkers into a group, whose data may be used to train an HMM that represents those talkers better than a more general model. It is shown that using the model "nearest" a novel talker improves the performance of a talker-independent speech recognition system.
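The core idea — a divergence-based distance between discrete output distributions, with identification by nearest reference model — can be sketched as follows. This is an illustrative sketch only, not the paper's implementation: the function names and the use of a simple symmetrized relative entropy over plain probability lists are assumptions for clarity (the paper computes divergences between full HMM output probabilities over an MMI tree partition).

```python
import math

def kl_divergence(p, q):
    """Relative entropy D(p || q) between two discrete distributions.

    Terms with p_i = 0 contribute nothing; assumes q_i > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def model_distance(p, q):
    """Symmetrized divergence, since D(p || q) is not symmetric in p and q."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))

def identify(unknown, references):
    """Label of the reference distribution nearest the unknown talker's model.

    `references` maps a talker label to that talker's output distribution.
    """
    return min(references, key=lambda label: model_distance(unknown, references[label]))
```

For talker clustering, the same pairwise `model_distance` values could feed any standard clustering procedure; the symmetrization step matters there because relative entropy alone is not a symmetric quantity.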