A fully recurrent neural network for recognition of noisy telephone speech

For many telephone applications, a speech recognition system (SRS) whose vocabulary consists of a few command words, digits, and connected digits is sufficient. However, an SRS developed for the telephone environment must cope with bandpass-limited speech and must maintain high recognition performance under speaker-independent and even adverse conditions. The SRS must also be efficient to implement. Fully recurrent neural networks (FRNN) offer a new approach to realizing a robust SRS with a single network. An FRNN scores features discriminatively and independently of the length of the feature sequence. By contrast, an SRS based on hidden Markov models (HMM) has to apply different methods for scoring the feature vectors and for compensating for variations in phone duration. Here we report on investigations into a monolithic FRNN-based SRS for telephone speech. Besides isolated word recognition, we present the capability of the FRNN-SRS to handle connected digit recognition. Furthermore, we show how FRNN can be immunized against several types of additive noise.
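
The claim that an FRNN scores a feature sequence independently of its length can be illustrated with a minimal sketch. It is not the recognizer described here: the network size, the sigmoid units, the random untrained weights, and the convention of reading class scores from the first units after the last frame are all assumptions made for illustration, as are the names `make_frnn`, `frnn_step`, and `frnn_scores`.

```python
import math
import random


def make_frnn(n_units, n_inputs, seed=0):
    """Random weights, one row per unit (illustrative, untrained).

    Each row has n_units + n_inputs + 1 entries: recurrent connections
    from every unit, connections from the input frame, and a bias.
    """
    rng = random.Random(seed)
    width = n_units + n_inputs + 1
    return [[rng.uniform(-0.5, 0.5) for _ in range(width)]
            for _ in range(n_units)]


def frnn_step(weights, state, frame):
    """One time step: every unit sees all unit outputs, the frame, and a bias."""
    z = state + frame + [1.0]
    return [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, z))))
            for row in weights]


def frnn_scores(weights, frames, n_classes):
    """Run the network over a whole utterance.

    Assumption: the first n_classes units act as output units and are
    read once, after the last frame has been consumed.
    """
    state = [0.0] * len(weights)
    for frame in frames:
        state = frnn_step(weights, state, frame)
    return state[:n_classes]


if __name__ == "__main__":
    net = make_frnn(n_units=6, n_inputs=3)
    short_utterance = [[0.1, 0.2, 0.3]] * 5    # 5 feature frames
    long_utterance = [[0.1, 0.2, 0.3]] * 40    # 40 feature frames
    # Both calls return one score per class, whatever the utterance length.
    print(frnn_scores(net, short_utterance, n_classes=4))
    print(frnn_scores(net, long_utterance, n_classes=4))
```

Because the same weights are applied at every frame and the state has a fixed size, no separate duration model or time alignment step appears in this sketch. A trained network would additionally need a learning procedure for the recurrent weights, which is outside its scope.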
