Training Statistical Language Models from Grammar-Generated Data: A Comparative Case-Study

Statistical language models (SLMs) for speech recognition have the advantage of robustness, and grammar-based language models (GLMs) the advantage that they can be built even when little corpus data is available. A known way to attempt to combine these two methodologies is first to create a GLM, and then use that GLM to generate training data for an SLM. It has, however, been difficult to evaluate the true utility of the idea, since the corpus data used to create the GLM has not in general been explicitly available. We exploit the Open Source Regulus platform, which supports corpus-based construction of linguistically motivated GLMs, to perform a methodologically sound comparison: the same data is used both to create an SLM directly, and also to create a GLM, which is then used to generate data to train an SLM. An evaluation on a medium-vocabulary task showed that the indirect method of constructing the SLM is in fact only marginally better than the direct one. The method used to create the training data is critical, with PCFG generation heavily outscoring CFG generation.
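The generation step at the heart of the indirect method can be illustrated with a minimal sketch. The toy grammar and rule probabilities below are invented for illustration (they are not the Regulus grammar or the paper's actual sampler); the point is that PCFG generation samples sentences in proportion to rule probabilities, so the resulting training corpus reflects the distribution encoded in the grammar, whereas plain CFG generation treats all expansions as equally likely.

```python
import random

# Toy PCFG (illustrative only): each nonterminal maps to a list of
# (probability, expansion) pairs; symbols absent from the table are terminals.
PCFG = {
    "S":  [(1.0, ["NP", "VP"])],
    "NP": [(0.6, ["the", "patient"]), (0.4, ["the", "doctor"])],
    "VP": [(0.7, ["has", "NP"]), (0.3, ["examines", "NP"])],
}

def sample(symbol="S", depth=0, max_depth=20):
    """Recursively expand a symbol, choosing rules by their probabilities."""
    if symbol not in PCFG:        # terminal symbol: emit it as-is
        return [symbol]
    if depth > max_depth:         # guard against runaway recursion
        return []
    probs, expansions = zip(*PCFG[symbol])
    rule = random.choices(expansions, weights=probs, k=1)[0]
    words = []
    for s in rule:
        words.extend(sample(s, depth + 1, max_depth))
    return words

def generate_corpus(n):
    """Sample n sentences from the PCFG; frequent rules yield frequent
    sentences, so an SLM trained on this corpus inherits the weighting."""
    return [" ".join(sample("S")) for _ in range(n)]

corpus = generate_corpus(1000)
```

The CFG variant of the experiment corresponds to setting all rule weights equal (or enumerating strings up to a length bound), which discards the frequency information that, on the paper's account, makes PCFG-generated training data markedly more effective.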
