A hybrid HMM-MLP speaker verification algorithm for telephone speech

This paper describes experiments on integrating MLP (multilayer perceptron) and HMM (hidden Markov model) techniques for fixed-text speaker verification. A large speech database collected over the telephone network was used to evaluate the algorithm. Speech data for each speaker was automatically segmented using a supervised HMM-Viterbi decoding scheme, and an MLP was trained on this segmented data. The output scores of the MLP, after appropriate scaling, were used as observation probabilities in a Viterbi realignment and scoring step. Intra-speaker and inter-speaker scores were generated by training the HMM-MLP system for each speaker and testing it against speech data from the same speaker and from all other speakers who spoke utterances of identical text. Our results show that MLP classifiers combined with HMMs improve speaker discrimination by 20% over conventional HMM algorithms for speaker verification.
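The realignment and scoring step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract says only that the MLP outputs are used "after appropriate scaling", so dividing each posterior by a state prior is an assumption here, as are the left-to-right topology, the fixed transition log-probabilities, and the hypothetical function names `scaled_log_likelihoods` and `viterbi_left_to_right`.

```python
import math


def scaled_log_likelihoods(posteriors, priors, floor=1e-10):
    """Turn per-frame MLP outputs P(state | frame) into scaled log-likelihoods.

    Assumption: scaling means dividing by the state prior. The source does
    not specify which scaling was used.
    """
    return [
        [math.log(max(p, floor)) - math.log(max(q, floor))
         for p, q in zip(frame, priors)]
        for frame in posteriors
    ]


def viterbi_left_to_right(log_obs, log_stay, log_next):
    """Align frames to a left-to-right HMM (start in state 0, end in the last state).

    log_obs[t][s] is the observation log-score of frame t in state s.
    Returns (best path log-score, list of state indices per frame).
    Requires at least as many frames as states for a finite score.
    """
    n_frames, n_states = len(log_obs), len(log_obs[0])
    neg_inf = float("-inf")
    delta = [[neg_inf] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    delta[0][0] = log_obs[0][0]

    for t in range(1, n_frames):
        for s in range(n_states):
            stay = delta[t - 1][s] + log_stay
            move = delta[t - 1][s - 1] + log_next if s > 0 else neg_inf
            if move > stay:
                delta[t][s] = move + log_obs[t][s]
                back[t][s] = s - 1
            else:
                delta[t][s] = stay + log_obs[t][s]
                back[t][s] = s

    # Trace back from the final state to recover the segmentation.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return delta[n_frames - 1][n_states - 1], path


# Toy example: 4 frames, 2 states, made-up MLP outputs and uniform priors.
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
priors = [0.5, 0.5]
log_obs = scaled_log_likelihoods(posteriors, priors)
score, path = viterbi_left_to_right(log_obs, math.log(0.5), math.log(0.5))
frame_normalized_score = score / len(posteriors)
```

In a verification setting, a per-frame score such as `frame_normalized_score` would be compared against a threshold chosen from the intra-speaker and inter-speaker score distributions; that threshold is not given in the source.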
