Paper information - Biologically inspired features used for robust phoneme recognition

Formants are regarded as the basic building blocks of vowels; however, they are very rarely used as features for difficult automatic speech recognition tasks. Formant-based research generally focuses on formant extraction, on the assumption that a better extraction method is the only way to increase the effectiveness of formants. We challenge this assumption by investigating a different use of formants after their extraction. By applying the same principles of combining formants that are observed in speech perception studies, we create features that show good recognition performance under noisy testing conditions. Improved recognition performance with the proposed formant features is demonstrated by comparison with Mel-frequency cepstral coefficients and perceptual linear predictive coding features on a hidden Markov model-based automatic speech recognition system.

Alex Pappachen James | Sima Dimitrijev | Mitar Milacic
