Paper information - Unified language modeling using finite-state transducers with first applications


In this paper, we investigate a weighted finite-state transducer approach to language modeling for speech recognition applications. We explore a unified framework for conversational speech recognition that combines the benefits of grammars, n-gram language models, and class-based language models with the flexibility of dynamic data and the potential for integrating semantics. Based on a virtual personal assistant application, we present first applications and recognition results for out-of-grammar handling and for the integration of class-based, weighted, dynamic data into this framework.

Hans J. G. A. Dolfing | David Horowitz | Pierce Gerard Buckley

[1] Frédéric Béchet, et al. A language model combining n-grams and stochastic finite state automata, 1999, EUROSPEECH.

[2] Xuedong Huang, et al. A unified context-free grammar and n-gram model for spoken language processing, 2000, IEEE International Conference on Acoustics, Speech, and Signal Processing.

[3] Chin-Hui Lee, et al. Hierarchical class n-gram language models: towards better estimation of unseen events in speech recognition, 2003, INTERSPEECH.

[4] David Horowitz, et al. Conversational Dialogue Management in the FASiL project, 2004, SIGDIAL Workshop.

[5] Andrej Ljolje, et al. The AT&T LVCSR-2000 System, 2000.

[6] Isabel Trancoso, et al. Transducer composition for "on-the-fly" lexicon and language model integration, 2001, IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU '01).

[7] Richard M. Schwartz, et al. Statistical Language Processing Using Hidden Understanding Models, 1994, HLT.

[8] Hans J. G. A. Dolfing, et al. Incremental language models for speech recognition using finite-state transducers, 2001, IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU '01).

[9] Steve J. Young, et al. Talking to machines (statistically speaking), 2002, INTERSPEECH.

[10] Xavier L. Aubert, et al. An overview of decoding techniques for large vocabulary continuous speech recognition, 2002, Comput. Speech Lang.

[11] Chin-Hui Lee, et al. A speech understanding system based on statistical representation of semantics, 1992, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-92).

[12] David Horowitz, et al. A maximum entropy shallow functional parser for spoken language understanding, 2004, INTERSPEECH.

[13] Fernando Pereira, et al. Weighted finite-state transducers in speech recognition, 2002, Comput. Speech Lang.

[14] Johan Schalkwyk, et al. Speech recognition with dynamic grammars using finite-state transducers, 2003, INTERSPEECH.

[15] Wolfgang Minker. Stochastically-based natural language understanding across tasks and languages, 1997, EUROSPEECH.