Paper information - A voice search approach to replying to SMS messages in automobiles

Automotive infotainment systems now provide drivers the ability to hear incoming Short Message Service (SMS) text messages using text-to-speech. However, the question of how best to allow users to respond to these messages using speech recognition remains unsettled. In this paper, we propose a robust voice search approach to replying to SMS messages based on template matching. The templates are empirically derived from a large SMS corpus and matches are accurately retrieved using a vector space model. In evaluating SMS replies within the acoustically challenging environment of automobiles, the voice search approach consistently outperformed using just the recognition results of a statistical language model or a probabilistic context-free grammar. For SMS replies covered by our templates, the approach achieved as high as 89.7% task completion when evaluating the top five reply candidates.
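The retrieval step summarized above, matching a possibly misrecognized utterance against reply templates with a vector space model, can be illustrated with a minimal sketch. It assumes TF-IDF weighting over whitespace tokens and cosine similarity; the abstract does not specify the paper's actual features, weighting, or templates, so the function names (`tfidf_vectors`, `cosine`, `top_replies`) and the example templates are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of template strings."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    # Document frequency: number of templates containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumed weighting, not taken from the paper).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_replies(recognized, templates, k=5):
    """Return the k templates most similar to the recognized text."""
    vectors, idf = tfidf_vectors(templates)
    # Terms never seen in any template carry no weight in the query.
    counts = Counter(recognized.lower().split())
    query = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    ranked = sorted(
        zip(templates, vectors),
        key=lambda pair: cosine(query, pair[1]),
        reverse=True,
    )
    return [template for template, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical templates; the paper derives its own from an SMS corpus.
    templates = ["i am on my way", "call you later", "see you soon", "ok thanks"]
    # "weigh" stands in for a recognition error on "way".
    print(top_replies("i am on my weigh", templates, k=2))
```

Returning a ranked list rather than a single best match mirrors the abstract's evaluation over the top five reply candidates: a partially misrecognized utterance can still retrieve the intended template because the remaining correct words overlap with it.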

Tim Paek | Yun-Cheng Ju
