Structured Output Layer Neural Network Language Model

This paper introduces a new neural network language model (NNLM) that structures the output vocabulary through word clustering: the Structured Output Layer NNLM. The model can handle vocabularies of arbitrary size, hence dispensing with the short-lists commonly used in NNLMs. Several softmax layers replace the standard output layer, and the output structure follows a word clustering that uses the continuous word representations induced by an NNLM. Speech-to-text experiments were carried out on the GALE Mandarin data, on which the well-tuned baseline system has a character error rate under 10%. Our model achieves consistent improvements over the combination of an n-gram model and classical short-list NNLMs, both in perplexity and in recognition accuracy.
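To make the factorized output layer concrete, below is a minimal sketch of a two-level class-based softmax in Python (NumPy). The clustering map `word_to_class`, the class names, and the shallow two-level structure are illustrative assumptions, not the paper's implementation; the SOUL model builds a deeper clustering tree from the continuous word representations, but the same chain-rule decomposition P(w | h) = P(c(w) | h) * P(w | c(w), h) applies at each level.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class StructuredOutputLayer:
    """Two-level structured output: P(w|h) = P(c(w)|h) * P(w|c(w), h).

    `word_to_class` is a hypothetical partition of the vocabulary into
    classes (in the paper the clustering is induced from continuous word
    representations; any partition works for this sketch).
    """
    def __init__(self, hidden_dim, word_to_class, rng=None):
        rng = rng or np.random.default_rng(0)
        self.word_to_class = np.asarray(word_to_class)
        n_classes = self.word_to_class.max() + 1
        # One softmax over the classes...
        self.W_class = rng.normal(scale=0.1, size=(hidden_dim, n_classes))
        # ...and one small softmax per class over its member words.
        self.class_words = [np.flatnonzero(self.word_to_class == c)
                            for c in range(n_classes)]
        self.W_word = [rng.normal(scale=0.1, size=(hidden_dim, len(ws)))
                       for ws in self.class_words]

    def log_prob(self, h, w):
        """log P(w | h) for a hidden-layer state h of shape (hidden_dim,)."""
        c = self.word_to_class[w]
        log_p_class = np.log(softmax(h @ self.W_class))[c]
        in_class = softmax(h @ self.W_word[c])
        idx = int(np.where(self.class_words[c] == w)[0][0])
        return log_p_class + np.log(in_class[idx])

# Usage: a toy 10-word vocabulary split into 3 clusters.
layer = StructuredOutputLayer(hidden_dim=8,
                              word_to_class=[0, 0, 0, 1, 1, 1, 2, 2, 2, 2])
h = np.random.default_rng(1).normal(size=8)
print(np.exp(layer.log_prob(h, w=4)))  # P(word 4 | h)
# The probabilities over the full vocabulary sum to 1 by construction:
# summing P(c|h) * P(w|c,h) over all words collapses to sum over P(c|h) = 1.
```

Because each softmax normalizes over only a subset of the output units (the classes, then the words within one class), the cost of computing P(w | h) no longer grows linearly with the full vocabulary size, which is what lets the model dispense with short-lists.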
