Speech coding with multi-layer networks

The authors investigate combining a structural, knowledge-based approach for describing speech units with neural networks that automatically learn relations between acoustic properties and those units. They show how speech coding can be performed by sets of multilayer neural networks whose execution is decided by a data-driven strategy. Coding is based on phonetic properties that characterize a large population of speakers. Results are reported on speaker-independent vowel recognition using an ear model for preprocessing.
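
The abstract describes an architecture in which several multilayer networks exist side by side and a data-driven rule decides which one to execute on a given input. The following is a minimal hypothetical sketch of that idea, not the authors' actual system: the feature dimensions, the "vowel"/"consonant" split, and the energy-based selection cue are all illustrative assumptions.

```python
# Hypothetical sketch of "sets of multilayer networks whose execution is
# decided by a data-driven strategy": a crude cue selects which specialist
# MLP runs on each frame; the chosen network emits a phonetic-property code.
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 20    # assumed size of an auditory (ear-model) feature frame
N_PROPERTIES = 6   # assumed number of phonetic-property code bits


def mlp_init(sizes):
    """Random weights and biases for a simple multilayer perceptron."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def mlp_forward(params, x):
    """Forward pass with sigmoid units at every layer."""
    for W, b in params:
        x = 1.0 / (1.0 + np.exp(-(x @ W + b)))
    return x


# One specialist network per broad phonetic class (illustrative split).
specialists = {
    "vowel":     mlp_init([N_FEATURES, 16, N_PROPERTIES]),
    "consonant": mlp_init([N_FEATURES, 16, N_PROPERTIES]),
}


def data_driven_choice(frame):
    """Toy data-driven strategy: low-band vs. high-band energy stands in
    for a real acoustic cue deciding which specialist to execute."""
    half = N_FEATURES // 2
    return "vowel" if frame[:half].sum() > frame[half:].sum() else "consonant"


def code_frame(frame):
    """Execute only the selected specialist and return its property code."""
    which = data_driven_choice(frame)
    return which, mlp_forward(specialists[which], frame)


frame = rng.random(N_FEATURES)       # stand-in for one ear-model frame
which, code = code_frame(frame)
print(which, np.round(code, 2))
```

The design point the sketch illustrates is that only the selected network is evaluated, so the data-driven strategy controls execution rather than merely weighting outputs, as a soft mixture would.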
