Paper Information - Speaker Independent Speech Recognition with Neural Networks and Speech Knowledge

We attempt to combine neural networks with knowledge from speech science to build a speaker independent speech recognition system. This knowledge is utilized in designing the preprocessing, input coding, output coding, output supervision and architectural constraints. To handle the temporal aspect of speech we combine delays, copies of activations of hidden and output units at the input level, and Back-Propagation for Sequences (BPS), a learning algorithm for networks with local self-loops. This strategy is demonstrated in several experiments, in particular a nasal discrimination task for which the application of a speech theory hypothesis dramatically improved generalization.
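The temporal mechanisms named in the abstract (delayed inputs and units with local self-loops) can be pictured with a minimal forward-pass sketch. This is only an illustration under stated assumptions, not the paper's implementation: the function names `delayed_window` and `self_loop_layer`, the tanh activation, and the zero padding at the start of the sequence are all hypothetical choices, and the BPS learning rule itself is not shown.

```python
import math

def delayed_window(frames, t, n_delays):
    """Concatenate frame t with its n_delays predecessors.

    Assumption: frames before the start of the sequence are zero-padded.
    """
    dim = len(frames[0])
    window = []
    for d in range(n_delays + 1):
        idx = t - d
        window.extend(frames[idx] if idx >= 0 else [0.0] * dim)
    return window

def self_loop_layer(frames, weights, self_weights, biases, n_delays=1):
    """Forward pass of units whose only recurrent link is to themselves.

    weights[i]      : feedforward weights of unit i over the delayed window
    self_weights[i] : weight of unit i's local self-loop
    Assumption: tanh activation and zero initial state.
    """
    state = [0.0] * len(biases)
    outputs = []
    for t in range(len(frames)):
        x = delayed_window(frames, t, n_delays)
        state = [
            math.tanh(
                self_weights[i] * state[i]                   # local self-loop
                + sum(w * v for w, v in zip(weights[i], x))  # delayed inputs
                + biases[i]
            )
            for i in range(len(biases))
        ]
        outputs.append(state)
    return outputs

# Toy usage: one unit, one-dimensional frames, one delay.
toy_frames = [[1.0], [0.0]]
toy_outputs = self_loop_layer(toy_frames, [[1.0, 0.0]], [0.5], [0.0], n_delays=1)
```

Because each unit feeds back only to itself, its state depends on its own previous activation and on the current window of inputs, which is what makes the recurrence "local" in this sketch.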

[1] Yoshua Bengio, et al. Programmable execution of multi-layered networks for automatic speech recognition, 1989, CACM.

[2] Stephanie Seneff, et al. Pitch and spectral analysis of speech based on an auditory synchrony model, 1985.

[3] Piero Cosi, et al. On the Generalization Capability of Multi-Layered Networks in the Extraction of Speech Properties, 1989, IJCAI.

[4] Bishnu S. Atal, et al. Efficient coding of LPC parameters by temporal decomposition, 1983, ICASSP.

[5] Barak A. Pearlmutter. Learning State Space Trajectories in Recurrent Neural Networks, 1989, Neural Computation.

[6] M. Gori, et al. BPS: a learning algorithm for capturing the dynamic nature of speech, 1989, International 1989 Joint Conference on Neural Networks.