Speaker identification using Hidden Conditional Random Field-based speaker models

This paper studies the use of Hidden Conditional Random Fields (HCRFs) to build speaker models. We propose a novel training algorithm that combines a discriminative training criterion with the HCRF for speaker identification. We also apply discriminative training to GMM, HMM, and HCRF speaker models, and compare the speaker identification performance of the three models for different amounts of training speech, on both clean and noisy test speech. The experimental results indicate that the HCRF model consistently achieved the lowest error rate among the three models, regardless of the length of the training and test speech and the presence of noise.
