Neural Networks for Speech Recognition

This paper describes a multi-layer neural network model, based on Kohonen’s algorithm, with a physiologically based cochlear model acting as the front-end processor. Sixty-dimensional spectral vectors, produced by the cochlear model, act as inputs to the first layer. Simulations of the first layer on a 9 × 9 neural array were carried out for utterances of the digits 0 to 9. Results show that the network produces similar trajectories for different utterances of the same word and different trajectories for different words. The network also exhibits time-warp invariance, a desirable property in speech recognition systems. Finally, a second neural array, which accepts inputs from the first array, is described.
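
To make the idea of a trajectory concrete, the following is a minimal sketch of a Kohonen-style self-organizing map with the dimensions given above (a 9 × 9 array, sixty-dimensional inputs). It is not the paper's implementation. The Gaussian neighbourhood, the linear decay schedules, the parameter values, and all function names are assumptions made for illustration, and the cochlear front end is not modelled: any list of sixty-dimensional frames can be used as input. The sketch also assumes that a trajectory is recorded as the sequence of distinct winning nodes, which is one simple way a map of this kind could become insensitive to how long each sound is held.

```python
import math
import random

GRID = 9   # 9 x 9 neural array, as stated in the abstract
DIM = 60   # sixty-dimensional spectral vectors


def init_weights(rng):
    """One random weight vector per node (assumed initialisation)."""
    return [[rng.random() for _ in range(DIM)] for _ in range(GRID * GRID)]


def best_matching_unit(weights, x):
    """Index of the node whose weights are closest to x (squared Euclidean)."""
    best, best_d = 0, float("inf")
    for i, w in enumerate(weights):
        d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        if d < best_d:
            best, best_d = i, d
    return best


def train(weights, frames, epochs=5, lr0=0.5, radius0=4.0, rng=None):
    """Kohonen update with assumed linearly decaying rate and radius."""
    if rng is None:
        rng = random.Random(0)
    total = epochs * len(frames)
    step = 0
    for _ in range(epochs):
        order = list(frames)
        rng.shuffle(order)
        for x in order:
            frac = 1.0 - step / total
            lr = lr0 * frac
            radius = max(radius0 * frac, 0.5)
            br, bc = divmod(best_matching_unit(weights, x), GRID)
            for i, w in enumerate(weights):
                r, c = divmod(i, GRID)
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2.0 * radius ** 2))  # Gaussian neighbourhood
                for k in range(DIM):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


def trajectory(weights, frames):
    """Sequence of winning (row, col) nodes, with consecutive repeats merged."""
    path = []
    for x in frames:
        node = divmod(best_matching_unit(weights, x), GRID)
        if not path or path[-1] != node:
            path.append(node)
    return path
```

Under these assumptions, an utterance stretched by repeating each frame maps to the same trajectory as the original, because repeated frames select the same winning node and consecutive repeats are merged. Real time warping is less regular than frame repetition, so this shows only the simplest case of the property.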