Speaker verification using temporal decorrelation post-processing

A text-dependent method of speaker verification is described that exploits the statistical correlation between measured speech features across whole words. The correlation is used in a linear discriminant analysis to define uncorrelated word-level features, which serve as the verification metric. Initial results indicate that this method can significantly reduce the amount of storage needed for speaker-specific speech information. The method also promises better verification performance than methods based on hidden Markov model (HMM) state-level observation metrics. Because the linear discriminant analysis yields features that are decorrelated over entire words, the method should be more robust to signal distortions that are consistent over the entire utterance.
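The decorrelation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each utterance of a word has already been reduced to one fixed-length word-level vector (for example, by concatenating per-state feature averages), and it substitutes a plain whitening transform based on the pooled within-speaker covariance for the full linear discriminant analysis. The function names, the `n_keep` parameter, and the rule of keeping the highest-variance directions are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_decorrelating_transform(word_vectors, speaker_ids, n_keep):
    """Estimate a linear transform that decorrelates word-level vectors.

    word_vectors: (n_utterances, dim) array, one whole-word vector per row
    speaker_ids:  (n_utterances,) array of speaker labels
    n_keep:       number of decorrelated features to keep (assumed parameter)
    """
    word_vectors = np.asarray(word_vectors, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    speakers = np.unique(speaker_ids)

    # Subtract each speaker's mean so only within-speaker variation remains.
    centered = np.empty_like(word_vectors)
    for spk in speakers:
        mask = speaker_ids == spk
        centered[mask] = word_vectors[mask] - word_vectors[mask].mean(axis=0)

    # Pooled within-speaker covariance of the word-level vectors.
    dof = len(word_vectors) - len(speakers)
    cov = centered.T @ centered / dof

    # Eigenvectors of the symmetric covariance give uncorrelated directions.
    eigvals, eigvecs = np.linalg.eigh(cov)
    keep = np.argsort(eigvals)[::-1][:n_keep]  # assumption: keep largest

    # Scale each kept direction to unit within-speaker variance.
    return eigvecs[:, keep] / np.sqrt(eigvals[keep])

def verification_distance(transform, claimed_reference, test_vector):
    """Squared Euclidean distance in the decorrelated feature space."""
    diff = np.asarray(test_vector, dtype=float) - np.asarray(claimed_reference, dtype=float)
    projected = diff @ transform
    return float(projected @ projected)
```

Under these assumptions, a speaker's reference for a word reduces to `n_keep` numbers after projection, which is one way to read the storage reduction described in the abstract. A claim would be accepted when `verification_distance` falls below a threshold chosen on held-out data.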
