Using neural networks and LPCC to improve speech recognition

Linear Predictive Coding (LPC) is a powerful speech analysis technique: it is very useful for encoding speech at a low bit rate and provides extremely accurate estimates of speech parameters. It is based on the assumption that the speech signal is produced by a buzzer at the end of a tube: the glottis produces the buzz, characterized by its intensity and frequency, and the vocal tract forms the tube, characterized by its resonance frequencies (formants) (Calliope, 1989). This model is very efficient for vocalic regions. It is less efficient for transient, unvoiced, or non-stationary regions, according to L. Rabiner and B.-H. Juang (1993). A Radial Basis Function network can recognize, with a satisfactory recognition rate, a set of phonemes pronounced by different speakers, using sets of LPC coefficients as input.
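As an illustration of the kind of features involved, the following minimal sketch computes LPC coefficients for a single speech frame with the autocorrelation method and the Levinson-Durbin recursion, then converts them to LPC cepstral coefficients (LPCC) with the standard recursion. It is not taken from the paper: the function names, the sign convention A(z) = 1 + a1 z^-1 + ... + ap z^-p, and the omission of pre-emphasis, windowing, and the gain term are all assumptions made for this example.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order):
    """Levinson-Durbin recursion (illustrative helper).

    Returns (a, err) with a[0] == 1, so that the predictor is
    s[n] ~ -(a[1]*s[n-1] + ... + a[order]*s[n-order]),
    and err is the final prediction error energy.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or perfectly predictable frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of the all-pole model 1/A(z).

    The gain-dependent term c[0] is left out (an assumption of this sketch).
    """
    p = len(a) - 1
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = sum(m * c[m] * a[n - m] for m in range(1, n) if n - m <= p)
        a_n = a[n] if n <= p else 0.0
        c[n] = -a_n - acc / n
    return c[1:]
```

In a recognizer of the kind described here, one such coefficient vector per frame would serve as the input to the network; the frame length, model order, and number of cepstral coefficients are left open because the text does not specify them.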

[1] Biing-Hwang Juang, et al. Fundamentals of speech recognition, 1993, Prentice Hall signal processing series.

[2] Hsiao-Wuen Hon, et al. An overview of the SPHINX speech recognition system, 1990, IEEE Trans. Acoust. Speech Signal Process.