Paper information - Automatic speaker identification by means of Mel cepstrum, wavelets and wavelet packets

This work uses delta cepstral coefficients on the Mel scale, together with wavelet and wavelet packet transforms, as input features for an automatic speaker identification system based on neural networks. Several alternatives for the neural-network classifier are tested, and very good performance is achieved for closed groups of speakers in text-independent mode. When a single neural net is used for all speakers, the results degrade abruptly as the number of speakers to identify increases. This led to a system with one neural net per speaker, which gave excellent results compared with those reported in the literature for other methods. This classifier structure has further advantages: for example, adding a new speaker to the system requires training only a net for that speaker, whereas a classifier formed by a single large net must, in general, be retrained completely.
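The one-net-per-speaker structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Each "net" is reduced to a single logistic unit. The two-dimensional feature frames are toy values rather than Mel cepstra or wavelet coefficients. A fixed background set of non-target frames, used as negative examples, is an assumption made here so that each net can be trained independently. The names `SpeakerNet`, `SpeakerBank`, `add_speaker`, and `identify` are hypothetical.

```python
import math


class SpeakerNet:
    """Hypothetical stand-in for one per-speaker net: a single logistic unit."""

    def __init__(self, dim):
        self.w = [0.0] * dim
        self.b = 0.0

    def score(self, frame):
        # Sigmoid output in (0, 1): how strongly the frame matches this speaker.
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, frame))
        return 1.0 / (1.0 + math.exp(-z))

    def train(self, own_frames, background_frames, epochs=200, lr=0.5):
        # Target 1 for the speaker's own frames, 0 for background frames.
        data = [(f, 1.0) for f in own_frames] + [(f, 0.0) for f in background_frames]
        for _ in range(epochs):
            for frame, target in data:
                err = self.score(frame) - target  # cross-entropy gradient w.r.t. z
                self.w = [wi - lr * err * xi for wi, xi in zip(self.w, frame)]
                self.b -= lr * err


class SpeakerBank:
    """One net per enrolled speaker; identification picks the best-scoring net."""

    def __init__(self, dim, background_frames):
        self.dim = dim
        self.background = background_frames  # assumed shared negative examples
        self.nets = {}

    def add_speaker(self, name, frames):
        # Only the new speaker's net is trained; existing nets are not touched.
        net = SpeakerNet(self.dim)
        net.train(frames, self.background)
        self.nets[name] = net

    def identify(self, frames):
        # Average each net's score over all frames and return the best speaker.
        def mean_score(name):
            net = self.nets[name]
            return sum(net.score(f) for f in frames) / len(frames)

        return max(self.nets, key=mean_score)


# Toy data: each speaker occupies a different region of a 2-D feature space.
background = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
bank = SpeakerBank(dim=2, background_frames=background)
bank.add_speaker("alice", [[2.0, 0.0], [2.1, 0.1], [1.9, -0.1]])
bank.add_speaker("bob", [[0.0, 2.0], [0.1, 2.1], [-0.1, 1.9]])

alice_weights_before = (list(bank.nets["alice"].w), bank.nets["alice"].b)
# Enrolling a third speaker trains one new net and leaves the others as they were.
bank.add_speaker("carol", [[-2.0, -2.0], [-2.1, -1.9], [-1.9, -2.1]])
print(bank.identify([[2.05, 0.0], [1.95, 0.05]]))
```

In this sketch, enrolling `carol` costs one extra training run, while a single net with one output per speaker would need a new output unit and retraining on every speaker's data.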

Hugo Leonardo Rufiner | H. M. Torres

[1] Douglas A. Reynolds, et al. Robust text-independent speaker identification using Gaussian mixture speaker models, 1995, IEEE Trans. Speech Audio Process.

[2] Jonathan G. Fiscus, et al. DARPA TIMIT: acoustic-phonetic continuous speech corpus CD-ROM, NIST speech disc 1-1.1, 1993.

[3] Gérard Chollet, et al. Combining methods to improve speaker verification decision, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[4] John H. L. Hansen, et al. Discrete-Time Processing of Speech Signals, 1993.

[5] Lou Boves, et al. A new procedure for classifying speakers in speaker verification systems, 1997, EUROSPEECH.

[6] Geoffrey E. Hinton, et al. Phoneme recognition using time-delay neural networks, 1989, IEEE Trans. Acoust. Speech Signal Process.