Biologically inspired features used for robust phoneme recognition

Formants are regarded as the basic building blocks of vowels, yet they are rarely used as features for difficult automatic speech recognition tasks. Formant-based research generally focuses on formant extraction, on the assumption that a better extraction method is the only way to make formants more effective. We challenge this assumption by investigating a different use of formants after they have been extracted. By combining formants according to the same principles observed in speech perception studies, we create features that show good recognition performance under noisy testing conditions. The improved recognition performance of the proposed formant features is demonstrated by comparison with Mel-frequency cepstral coefficients and perceptual linear predictive coding features on a hidden Markov model-based automatic speech recognition system.