Using semantic class information for rapid development of language models within ASR dialogue systems

When dialogue system developers tackle a new domain, much effort is required, and the development of the different parts of the system usually proceeds independently. Yet it may be profitable to coordinate development efforts across modules. We focus on extending small amounts of language model training data by integrating semantic classes originally created for a natural language understanding module. By converting finite-state parses of a training corpus into a probabilistic context-free grammar and subsequently generating artificial data from that grammar, we can significantly reduce perplexity and automatic speech recognition (ASR) word error rate in situations with little training data. Experiments are presented using data from the ATIS and DARPA Communicator travel corpora.
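As a rough illustration of the data-generation step, the sketch below samples artificial sentences from a probabilistic context-free grammar in which semantic classes appear as nonterminals. The grammar, class names, and probabilities here are invented for a toy travel domain and are not from the paper; in the approach described above, the rules and their probabilities would instead be estimated from finite-state parses of the (small) training corpus.

```python
import random

# Hypothetical toy PCFG for a travel domain (illustrative only): each
# nonterminal maps to a list of (right-hand side, probability) pairs.
# Semantic classes such as CITY or AIRLINE act as nonterminals.
PCFG = {
    "S": [(["i", "want", "to", "fly", "FROM_CITY", "TO_CITY"], 0.6),
          (["show", "me", "AIRLINE", "flights", "TO_CITY"], 0.4)],
    "FROM_CITY": [(["from", "CITY"], 1.0)],
    "TO_CITY":   [(["to", "CITY"], 1.0)],
    "CITY":      [(["boston"], 0.5), (["denver"], 0.3), (["atlanta"], 0.2)],
    "AIRLINE":   [(["united"], 0.5), (["delta"], 0.5)],
}

def expand(symbol):
    """Recursively expand a symbol; nonterminal rules are sampled by probability."""
    if symbol not in PCFG:  # terminal word
        return [symbol]
    rules, probs = zip(*PCFG[symbol])
    rhs = random.choices(rules, weights=probs, k=1)[0]
    words = []
    for sym in rhs:
        words.extend(expand(sym))
    return words

def generate_corpus(n=1000, start="S"):
    """Sample n artificial sentences to augment language model training data."""
    return [" ".join(expand(start)) for _ in range(n)]

if __name__ == "__main__":
    for sentence in generate_corpus(n=5):
        print(sentence)
```

The generated sentences would then simply be appended to the original corpus before n-gram estimation, so that word sequences licensed by the semantic classes but unseen in the small training set still receive probability mass.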