Speech processing using the average localized synchrony detection
A new auditory-based speech-processing system built on the biologically rooted property of average localized synchrony detection (ALSD) is proposed. The system modifies the generalized synchrony detector (GSD) [S. Seneff, J. Phonetics 16, 55–76 (1988)]. It generates a pseudospectrogram of the speech signal by detecting periodicity, while reducing both the response to individual harmonics of the fundamental frequency and the sensitivity to implementation mismatches. This is achieved without sacrificing frequency resolution, so the system gives a more consistent and robust representation of the formants. The system is evaluated on its ability to extract formants while suppressing spurious peaks. It is also compared with other auditory-based front-end processing systems on vowel, stop, and fricative classification, using clean speech from the TIMIT database and speech in the presence of noise. The results illustrate the advantage of the ALSD system in extracting formants and reducing spurious peaks while preserving frequency resolution. They also indicate that synchrony measures outperform the mean rate in the presence of noise. [Work supported by Catalyst Foundation.]
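The idea of detecting periodicity at a channel's center frequency can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the ratio below (envelope of the sum over envelope of the difference between a signal and a copy of itself delayed by one period) is only a simplified stand-in for a GSD-style measure, and the averaging over several channel outputs is an assumed reading of "average localized" detection. The cochlear filterbank, any saturation nonlinearity, and all names (`synchrony_score`, `averaged_synchrony`, `tone`, `eps`) are hypothetical.

```python
import math


def synchrony_score(signal, sample_rate, freq, eps=1e-9):
    """Periodicity score at `freq`: large when `signal` repeats every 1/freq seconds.

    Simplified stand-in for a synchrony detector (assumption, not the paper's formula).
    """
    lag = round(sample_rate / freq)  # one period of `freq`, in samples
    if lag < 1 or lag >= len(signal):
        raise ValueError("signal too short or freq out of range for this sample rate")
    sum_env = 0.0   # accumulates |x[n] + x[n - lag]|
    diff_env = 0.0  # accumulates |x[n] - x[n - lag]|
    for n in range(lag, len(signal)):
        sum_env += abs(signal[n] + signal[n - lag])
        diff_env += abs(signal[n] - signal[n - lag])
    # eps keeps the ratio finite when the signal is exactly periodic at `freq`
    return sum_env / (diff_env + eps)


def averaged_synchrony(channels, sample_rate, freq):
    """Average the score over several channel outputs, all tested at the same `freq`.

    `channels` is assumed to hold outputs of neighboring filters (hypothetical input).
    """
    if not channels:
        raise ValueError("at least one channel is required")
    scores = [synchrony_score(ch, sample_rate, freq) for ch in channels]
    return sum(scores) / len(scores)


def tone(freq, sample_rate=16000, n_samples=1600):
    """Pure sinusoid used here only as a toy test signal."""
    return [math.sin(2 * math.pi * freq * n / sample_rate) for n in range(n_samples)]
```

With these definitions, `synchrony_score(tone(500.0), 16000, 500.0)` is very large because the tone repeats exactly at the tested period, while testing the same tone at 700 Hz gives a ratio near one. Evaluating such a score across a bank of center frequencies and over successive time frames would yield a pseudospectrogram-like picture, though the actual system's filters and detector details are not specified in the abstract.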