A model distance measure for talker clustering and identification

This paper describes methods of talker clustering and identification based on a "distance" measure between discrete HMM output probabilities. The output probabilities are estimated on a tree-based MMI partition of the feature space rather than on the usual vector quantization. The information divergence (relative entropy) between speaker-dependent models serves as a quantitative measure of how much one talker differs from another. An immediate application is talker identification: an unknown speaker may be identified by finding the speaker-dependent reference model closest to a model trained on the unknown speaker's data. Another application is clustering similar talkers into a group; the group's data may be used to train an HMM that represents a talker in the group better than a more general model would. It is shown that using the model "nearest" a novel talker enhances the performance of a talker-independent speech recognition system.
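As a rough illustration of the identification step, the following Python sketch computes a relative-entropy distance between two models and picks the nearest reference. It is not the paper's implementation. Several details are assumptions made only for this example: each model is represented as a list of per-state discrete output distributions over the same partition cells, the divergence is symmetrized and averaged uniformly over states, and a small `eps` smooths zero probabilities. The names `kl_divergence`, `model_distance`, and `identify_talker` are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-10):
    """Relative entropy D(p || q), in nats, between two discrete distributions.

    eps is an assumed smoothing constant that avoids log(0) and division by zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of cells")
    p_s = [x + eps for x in p]
    q_s = [x + eps for x in q]
    zp, zq = sum(p_s), sum(q_s)  # renormalize after smoothing
    return sum((a / zp) * math.log((a / zp) / (b / zq)) for a, b in zip(p_s, q_s))


def model_distance(model_a, model_b):
    """Symmetrized divergence between two models, averaged over states.

    Assumption: both models have the same states in the same order, and every
    state is weighted equally. The source does not specify the weighting.
    """
    if len(model_a) != len(model_b):
        raise ValueError("models must have the same number of states")
    total = 0.0
    for pa, pb in zip(model_a, model_b):
        total += 0.5 * (kl_divergence(pa, pb) + kl_divergence(pb, pa))
    return total / len(model_a)


def identify_talker(unknown_model, reference_models):
    """Return the name of the reference model nearest to the unknown model."""
    return min(
        reference_models,
        key=lambda name: model_distance(unknown_model, reference_models[name]),
    )


# Toy data (invented for illustration): two states, three partition cells.
references = {
    "talker_a": [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]],
    "talker_b": [[0.2, 0.5, 0.3], [0.4, 0.4, 0.2]],
}
unknown = [[0.65, 0.25, 0.10], [0.15, 0.10, 0.75]]
nearest = identify_talker(unknown, references)
```

The same `model_distance` values could feed a clustering step that groups similar talkers, though the grouping procedure itself is not described in this abstract.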
