Paper information - Name-aware speech recognition for interactive question answering

Name-aware speech recognition for interactive question answering

In this work, we show how interactivity in a voice-enabled question answering application may improve speech recognition. We allow the user to provide a target named entity before asking the question. We then build a named-entity-specific language model from the documents that contain the named entity. The question-specific model is obtained by merging the named-entity-specific model with a model built on a set of questions. We present a set of experiments using the TREC question set on the AQUAINT corpus. The question-specific language model is compared with a baseline model built by merging a model of the AQUAINT corpus with a model of past TREC questions. On questions where pronominal references are resolved, the question-specific model achieves a 32.2% reduction in word error rate relative to the baseline.
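The abstract does not say how the two models are merged. As a minimal sketch, assuming the merge is a linear interpolation of word probabilities, the idea could look like the following. The unigram models, the add-one smoothing, the weight `lam`, and the toy sentences are all illustrative assumptions, not details taken from the paper.

```python
from collections import Counter

def unigram_model(sentences, vocab):
    """Add-one smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(w for s in sentences for w in s.lower().split())
    total = sum(counts[w] for w in vocab) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def interpolate(p_entity, p_questions, lam=0.5):
    """Assumed merge: lam * entity model + (1 - lam) * question model."""
    return {w: lam * p_entity[w] + (1 - lam) * p_questions[w] for w in p_entity}

# Hypothetical toy data: documents mentioning the target entity "everest",
# and a small set of past questions.
entity_docs = ["everest is in nepal", "climbers reach everest in spring"]
questions = ["where is the lab", "who reached the summit"]
vocab = sorted({w for s in entity_docs + questions for w in s.split()})

p_entity = unigram_model(entity_docs, vocab)
p_questions = unigram_model(questions, vocab)
p_merged = interpolate(p_entity, p_questions, lam=0.5)
```

In this sketch the merged model gives the entity name a higher probability than the question-only model does, which is the effect the question-specific model relies on. A real system would use higher-order n-gram models and tune the interpolation weight on held-out data.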

Gökhan Tür | Dilek Z. Hakkani-Tür | Svetlana Stoyanchev
