Improved language modeling for conversational applications using sentence quality

In this paper, we propose a new approach to building language models for conversational systems using a corpus of text, as opposed to a live or Wizard-of-Oz collection. Each sentence in the corpus is assigned a “quality” that reflects the developer's intuition for how likely that sentence is to be spoken by a real user to the live system. Language models (LMs) are built for each sentence quality, and these are subsequently interpolated to produce the final model. We have also built a classifier that assigns sentence qualities to the data, and the language models built from its output achieve similar improvements in word and turn error rate.
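The per-quality build-and-interpolate step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state the n-gram order, the smoothing method, the set of quality labels, or how the interpolation weights are chosen. The toy corpus, the `high`/`low` labels, the add-one smoothed unigram models, and the fixed weights below are all assumptions made for illustration.

```python
from collections import Counter


def train_unigram(sentences, vocab):
    """Add-one smoothed unigram model over a fixed, shared vocabulary."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values()) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def interpolate(models, weights):
    """Linear interpolation: p(w) = sum over q of weights[q] * models[q][w]."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    vocab = next(iter(models.values())).keys()
    return {w: sum(weights[q] * models[q][w] for q in models) for w in vocab}


# Hypothetical toy corpus of (sentence, quality) pairs; labels are assumed.
corpus = [
    ("play the next song", "high"),
    ("play some music", "high"),
    ("the song that was played is next", "low"),
]

# Shared vocabulary so every per-quality model covers the same words.
vocab = sorted({w for sentence, _ in corpus for w in sentence.split()})

# Group sentences by their assigned quality.
by_quality = {}
for sentence, quality in corpus:
    by_quality.setdefault(quality, []).append(sentence)

# One language model per sentence quality.
models = {q: train_unigram(sents, vocab) for q, sents in by_quality.items()}

# Assumed weights; in practice they would be tuned on held-out data.
weights = {"high": 0.8, "low": 0.2}
final_lm = interpolate(models, weights)
```

Because each component model is a proper distribution over the same vocabulary and the weights sum to one, `final_lm` is also a proper distribution. Giving more weight to the higher-quality bucket shifts probability mass toward words typical of sentences judged likely to be spoken to the live system.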
