Identification of speakers engaged in dialog

The approach is based on the robust evaluation of likelihoods computed over speech segments. The method shows that speakers can be identified with minimal loss of performance even when large amounts of undesired speech are present. The authors consider two cases: one in which models exist for only one of the two speakers, and one in which both speakers are to be identified. The role that clustering can play in this problem is also discussed. Robust identification methods are shown to be very effective in obtaining high correct-identification rates. The methods presented should apply to conferences involving more than two speakers, as well as to speaker identification situations where interference is a problem. Clustering is seen to be a viable alternative to robust methods and can be effective without normalization.
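
To make the idea of robust segment-based likelihood evaluation concrete, the following is a minimal sketch and not the authors' actual procedure. It assumes each speaker is modeled by a single diagonal-covariance Gaussian and that "robust" means averaging only the best-scoring fraction of segments (a trimmed mean), so that segments from an unmodeled second speaker are discarded. The function names, the `keep_fraction` parameter, and the toy data are all hypothetical.

```python
import math


def segment_loglik(segment, mean, var):
    """Average per-frame log-likelihood of one segment under a diagonal Gaussian."""
    total = 0.0
    for frame in segment:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(segment)


def robust_score(segments, mean, var, keep_fraction=0.5):
    """Trimmed mean: average only the best-scoring fraction of segments (assumed rule)."""
    scores = sorted(
        (segment_loglik(s, mean, var) for s in segments), reverse=True
    )
    n_keep = max(1, int(keep_fraction * len(scores)))
    return sum(scores[:n_keep]) / n_keep


def identify(segments, models, keep_fraction=0.5):
    """Return the name of the speaker model with the highest robust score."""
    best_name, best_score = None, -math.inf
    for name, (mean, var) in models.items():
        score = robust_score(segments, mean, var, keep_fraction)
        if score > best_score:
            best_name, best_score = name, score
    return best_name


# Toy data (hypothetical): one-dimensional features, two candidate models.
models = {"A": ([0.0], [1.0]), "B": ([5.0], [1.0])}

# Two segments from speaker A, three from an unmodeled interfering speaker.
segments = [
    [[0.1], [-0.1]],
    [[0.0], [0.2]],
    [[12.0], [12.1]],
    [[11.9], [12.0]],
    [[12.2], [11.8]],
]

robust_choice = identify(segments, models, keep_fraction=0.5)
plain_choice = identify(segments, models, keep_fraction=1.0)
```

In this toy setup, averaging over all segments (`keep_fraction=1.0`) lets the interfering speech dominate the score, while the trimmed score is driven by the segments that fit a model well. The single-Gaussian model and the trimming rule are illustrative choices only.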
