Paper information - Fusing acoustic, phonetic and data-driven systems for text-independent speaker verification


This paper describes our recent efforts in exploring data-driven high-level features and their combination with low-level spectral features for speaker verification. In particular, we compare the phonetic and data-driven approaches and study their complementarity with the short-term acoustic approach. Our objective is to show that data-driven units, automatically acquired from the speech data, can be used like phonemes to extract high-level features, and that they carry complementary speaker-specific information that yields improvements when fused with acoustic systems. Results obtained on the NIST 2006 Speaker Recognition Evaluation data show that combining the phonetic, data-driven and Gaussian Mixture Model (GMM) systems yields a 27% relative reduction in the equal error rate (EER) compared with the baseline GMM system.

Asmaa El Hannani | Dijana Petrovska-Delacrétaz
