Fast adaptive component weighted cepstrum pole filtering for speaker identification

Training and testing conditions for speaker identification are mismatched when the speech passes through a different channel in each case, which degrades identification performance. Features that vary little under the filtering effect of different channels make speaker identification systems more robust and therefore improve performance. It has been shown that subtracting the mean of the pole filtered linear predictive (LP) cepstrum from the actual LP cepstrum yields a robust feature, known as the pole filtered mean removed LP cepstrum. Another robust feature is the adaptive component weighted (ACW) cepstrum, particularly with mean removal. In this paper, we combine the ACW cepstrum with the pole filtering concept to obtain a new, more robust feature, namely the pole filtered mean removed ACW cepstrum. The new method is fast and performs better than both the pole filtered mean removed LP cepstrum and the mean removed ACW cepstrum. Experimental results are given for the TIMIT database under a variety of mismatched conditions.
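The mean removal step described in the abstract can be illustrated with a minimal sketch for the LP cepstrum case only; it does not implement the ACW variant or the fast algorithm. The function names, the cepstral order `n_ceps`, and the radius threshold `max_radius` are illustrative assumptions, as is the rule of pulling narrow-bandwidth poles back to a fixed radius; the abstract does not specify the pole modification.

```python
import numpy as np

def lp_cepstrum_from_poles(poles, n_ceps):
    """Cepstrum of the all-pole model 1/A(z): c[n] = (1/n) * sum_i z_i**n."""
    n = np.arange(1, n_ceps + 1)
    return np.real(np.sum(poles[:, None] ** n[None, :], axis=0)) / n

def pole_filter(poles, max_radius):
    """Assumed rule: poles with radius above max_radius are moved
    radially inward to max_radius; all other poles are left unchanged."""
    r = np.abs(poles)
    scale = np.minimum(1.0, max_radius / np.maximum(r, 1e-12))
    return poles * scale

def pole_filtered_mean_removed_cepstrum(lpc_frames, n_ceps=12, max_radius=0.9):
    """lpc_frames: iterable of coefficient vectors [1, a1, ..., ap]
    for A(z) = 1 + a1 z^-1 + ... + ap z^-p, one vector per frame."""
    actual, filtered = [], []
    for a in lpc_frames:
        poles = np.roots(a)  # roots of z^p + a1 z^(p-1) + ... + ap
        actual.append(lp_cepstrum_from_poles(poles, n_ceps))
        filtered.append(
            lp_cepstrum_from_poles(pole_filter(poles, max_radius), n_ceps)
        )
    actual = np.array(actual)
    filtered = np.array(filtered)
    # Subtract the mean of the pole filtered cepstrum from the actual cepstrum.
    return actual - filtered.mean(axis=0)
```

The only difference from ordinary cepstral mean subtraction is that the mean is taken over the pole filtered cepstra, while the vectors it is subtracted from remain the unmodified LP cepstra.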
