Speech coding with multilayer networks

This paper investigates combining a structural or knowledge-based approach to describing speech units with neural networks that automatically learn the relations between acoustic properties and those units. The authors show how speech coding can be performed by sets of multilayer neural networks whose execution is selected by a data-driven strategy. The coding is based on phonetic properties that characterize a large population of speakers. Results are reported for speaker-independent vowel recognition using an ear model for preprocessing.
