Paper information - Comparative experiments to evaluate the use of auditory-based acoustic distinctive features and formant cues for robust automatic speech recognition in low-SNR car environments

This paper evaluates the use of auditory-based acoustic distinctive features and formant cues for robust automatic speech recognition (ASR) in the presence of highly interfering car noise. Comparative experiments indicate that combining the classical mel-frequency cepstral coefficients (MFCCs) with auditory-based acoustic distinctive cues and either the main formant magnitudes or the formant frequencies of the speech signal, using a multi-stream paradigm, improves recognition performance in noisy car environments. To test the new multi-stream feature vector, a series of speaker-independent continuous-speech recognition experiments was carried out on a noisy version of the TIMIT database. The proposed multi-stream paradigm outperforms the conventional MFCC-based recognition process in noisy car environments over a wide range of SNRs.
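The abstract does not say how the noisy version of TIMIT was built. As a minimal sketch of one common approach, assuming additive noise scaled to a target SNR, the following Python shows how a noise recording could be mixed into clean speech. The function names `mix_at_snr` and `measured_snr_db` are hypothetical and are not taken from the paper.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio is `snr_db` dB, then add it.

    Assumption: simple additive mixing; the paper's actual procedure is not stated.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(x * x for x in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0.0:
        raise ValueError("noise must have non-zero power")
    # Target noise power is speech_power / 10^(SNR/10); gain is its square root ratio.
    gain = math.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return [x + gain * n for x, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """Return the SNR in dB of `noisy` relative to the clean `speech`."""
    signal_energy = sum(x * x for x in speech)
    noise_energy = sum((y - x) ** 2 for x, y in zip(speech, noisy))
    return 10.0 * math.log10(signal_energy / noise_energy)


if __name__ == "__main__":
    # Synthetic stand-ins for a speech frame and a car-noise frame.
    clean = [math.sin(0.1 * i) for i in range(1000)]
    car_noise = [((i * 7919) % 13 - 6) / 6.0 for i in range(1000)]
    for target in (-5, 0, 10):
        noisy = mix_at_snr(clean, car_noise, target)
        print(target, round(measured_snr_db(clean, noisy), 3))
```

Sweeping `snr_db` over several values in this way would yield test sets at different noise levels, which is one way to read "a wide range of SNRs".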

Douglas D. O'Shaughnessy | Sid-Ahmed Selouani | Hesham Tolba
