Normalizing internal representations for speech classification

Speech segments are encoded with an autoassociative network, and the possibility of classifying them by matching the resulting sequences of hidden-unit activations is studied. Good discrimination can be achieved by matching the lower-dimensional projections of an unknown segment against template speech patterns. The possibility of normalizing variations in the hidden-unit activation sequences is then explored. In particular, experiments demonstrate the advantage of the presented technique on single- and multi-speaker syllable discrimination tasks: normalizing the encoded representations of sounds within classes and across speakers improves results significantly.
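The core idea, encoding inputs through an autoassociative (auto-encoding) network's bottleneck and classifying an unknown by its distance to class templates in the hidden space, can be sketched as follows. This is a minimal hypothetical illustration, not the paper's implementation: the synthetic 12-dimensional "frames", the tied-weight linear network, and the helper names (`make_class`, `encode`, `classify`) are all assumptions introduced here. A linear autoassociative network of this kind learns the principal subspace of the data, in line with the Baldi–Hornik analysis the paper cites.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for spectral feature vectors: two synthetic classes of
# 12-dimensional frames (hypothetical; the paper uses real speech features).
def make_class(center, n=50):
    return center + 0.3 * rng.standard_normal((n, 12))

c0, c1 = rng.standard_normal(12), rng.standard_normal(12)
X = np.vstack([make_class(c0), make_class(c1)])

# Autoassociative network 12 -> 4 -> 12 with tied weights and linear units,
# trained by gradient descent to reconstruct its own input.
W = 0.1 * rng.standard_normal((12, 4))
for _ in range(2000):
    H = X @ W              # hidden (bottleneck) activations
    Xhat = H @ W.T         # reconstruction of the input
    E = Xhat - X
    # Gradient of 0.5 * ||X W W^T - X||^2 with respect to the tied W
    grad = X.T @ (E @ W) + E.T @ (X @ W)
    W -= 1e-3 * grad / len(X)

def encode(x):
    """Lower-dimensional projection: the hidden-unit activations."""
    return x @ W

# Class templates: mean hidden activation per class. An unknown is
# classified by its nearest template in the hidden space.
t0 = encode(make_class(c0)).mean(axis=0)
t1 = encode(make_class(c1)).mean(axis=0)

def classify(x):
    h = encode(x)
    return 0 if np.linalg.norm(h - t0) < np.linalg.norm(h - t1) else 1

unknown = c1 + 0.3 * rng.standard_normal(12)
print("predicted class:", classify(unknown))
```

In the paper, the matched objects are *sequences* of hidden activations over a speech segment rather than single frames, so the distance computation would run over aligned activation sequences; the frame-level sketch above only shows the encode-then-match structure.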
