Speaker Verification System Based on the Stochastic Modeling

In this paper we propose a new speaker verification system that introduces new training and classification algorithms for vector quantization and Gaussian mixture models. The vector quantizer is used to model sub-word speech components. Code books are created for both the training and the test utterances, and the test code book is quantized over the training code book. We propose new approaches to normalizing the distortion of the training and test code books. The normalization technique includes assigning equal distortion to the training and test code books, distortion normalization, and cluster weights. In addition, the LBG and K-means algorithms usually employed for vector quantization are used to train the Gaussian mixture models. Finally, we combine the information provided by the two models to increase verification performance. The performance of the proposed system has been tested on the Speaker Recognition database, which consists of telephone speech from 8 participants. Additional experiments have been performed on a subset of the NIST 1996 Speaker Recognition database.
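The idea of quantizing a test code book over a training code book can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes plain K-means training, squared Euclidean distance, and cluster weights taken as the fraction of vectors in each cluster, and it omits the distortion normalization the abstract mentions. The function names `train_codebook` and `codebook_distortion` and the synthetic two-dimensional "speakers" are hypothetical, introduced only for illustration.

```python
import numpy as np


def train_codebook(features, n_codewords, n_iter=20, seed=0):
    """Plain K-means (an assumption). Returns (codebook, cluster_weights)."""
    rng = np.random.default_rng(seed)
    # Initialise codewords from randomly chosen feature vectors.
    idx = rng.choice(len(features), size=n_codewords, replace=False)
    codebook = features[idx].astype(float)
    for _ in range(n_iter):
        # Squared Euclidean distance from every vector to every codeword.
        d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for k in range(n_codewords):
            members = features[labels == k]
            if len(members):  # keep the old codeword if a cluster is empty
                codebook[k] = members.mean(axis=0)
    # Recompute assignments against the final codewords.
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    labels = d.argmin(axis=1)
    # Cluster weight = fraction of vectors assigned to each codeword.
    weights = np.bincount(labels, minlength=n_codewords) / len(features)
    return codebook, weights


def codebook_distortion(test_cb, test_w, train_cb):
    """Weighted mean squared distance from each test codeword
    to its nearest training codeword (lower = better match)."""
    d = ((test_cb[:, None, :] - train_cb[None, :, :]) ** 2).sum(axis=2)
    return float((test_w * d.min(axis=1)).sum())


# Synthetic stand-ins for feature vectors (hypothetical data).
rng = np.random.default_rng(42)
enrol = rng.normal(0.0, 1.0, size=(200, 2))      # claimed speaker, training
genuine = rng.normal(0.0, 1.0, size=(200, 2))    # same speaker, test
impostor = rng.normal(8.0, 1.0, size=(200, 2))   # different speaker, test

train_cb, train_w = train_codebook(enrol, n_codewords=4)
gen_cb, gen_w = train_codebook(genuine, n_codewords=4, seed=1)
imp_cb, imp_w = train_codebook(impostor, n_codewords=4, seed=2)

d_genuine = codebook_distortion(gen_cb, gen_w, train_cb)
d_impostor = codebook_distortion(imp_cb, imp_w, train_cb)
print("genuine distortion:", d_genuine)
print("impostor distortion:", d_impostor)
```

In this sketch a verification decision would compare the distortion against a threshold; how that threshold is set, and how the score would be fused with a Gaussian mixture model score, is left open here.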
