Self-organizing letter code-book for text-to-phoneme neural network model

This paper describes an improved input coding method for a text-to-phoneme (TTP) neural network model intended for speaker-independent speech recognition systems. The letter code-book is self-organizing and is jointly optimized with the TTP model, ensuring that the coding is optimal with respect to overall TTP performance. The code-book is implemented as a set of single-layer neural networks with shared weights. Experiments show improved performance compared to the NETtalk and NETspeak models.
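The abstract suggests a NETtalk-style setup in which a window of letters is mapped to the phoneme of the centre letter, with the letter coding learned by a shared single-layer network trained together with the TTP model. The following is a minimal sketch of that idea in PyTorch; all class names, layer sizes, window length, and alphabet/phoneme inventory sizes are illustrative assumptions, not values from the paper.

```python
# Sketch of a jointly trained letter code-book and TTP model (assumed setup).
import torch
import torch.nn as nn

ALPHABET_SIZE = 29   # letters plus word-boundary/punctuation symbols (assumed)
CODE_DIM = 8         # size of the learned letter code (assumed)
WINDOW = 7           # letters of context fed to the TTP model (assumed)
NUM_PHONEMES = 40    # size of the phoneme inventory (assumed)

class SelfOrganizingTTP(nn.Module):
    def __init__(self):
        super().__init__()
        # Letter code-book: a single-layer network (linear map from one-hot
        # letters to a low-dimensional code) whose weights are shared across
        # every position in the input window.
        self.codebook = nn.Linear(ALPHABET_SIZE, CODE_DIM, bias=False)
        # TTP model: a feed-forward network on the concatenated letter codes.
        self.ttp = nn.Sequential(
            nn.Linear(WINDOW * CODE_DIM, 128),
            nn.Sigmoid(),
            nn.Linear(128, NUM_PHONEMES),
        )

    def forward(self, letters_onehot):
        # letters_onehot: (batch, WINDOW, ALPHABET_SIZE)
        codes = self.codebook(letters_onehot)   # same weights at every position
        codes = codes.flatten(start_dim=1)      # (batch, WINDOW * CODE_DIM)
        return self.ttp(codes)                  # phoneme logits for the centre letter

# Joint optimization: one loss and one optimizer over all parameters, so the
# code-book is learned together with the TTP model rather than fixed a priori.
model = SelfOrganizingTTP()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

letters = torch.zeros(4, WINDOW, ALPHABET_SIZE)
letters[:, :, 0] = 1.0                          # dummy one-hot batch
targets = torch.randint(0, NUM_PHONEMES, (4,))
loss = criterion(model(letters), targets)
loss.backward()
optimizer.step()
```

Because the code-book receives gradients from the same phoneme-classification loss as the rest of the network, the letter codes self-organize toward whatever representation best serves the downstream TTP task.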

[1] Terrence J. Sejnowski, et al., "Parallel Networks that Learn to Pronounce English Text," Complex Systems, 1987.

[2] Mark Bedworth, et al., "NETspeak: A Re-implementation of NETtalk," 1987.

[3] Heekuck Oh, et al., "Neural Networks for Pattern Recognition," Advances in Computers, 1993.

[4] S. K. Riis, et al., "Improving Prediction of Protein Secondary Structure Using Structured Neural Networks and Multiple Sequence Alignments," Journal of Computational Biology, 1996.

[5] Yoshua Bengio, et al., "Reading Checks with Multilayer Graph Transformer Networks," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 1997.

[6] Alan W. Black, et al., "Letter to Sound Rules for Accented Lexicon Compression," ICSLP, 1998.

[7] S. D. Hansen, et al., "Hidden Markov Models and Neural Networks for Speech Recognition," 1999.

[8] Olli Viikki, et al., "Low Complexity Speaker Independent Command Word Recognition in Car Environments," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2000.