Experiments in spoken queries for document retrieval

We report the results of three experiments using the errorful output of a large vocabulary continuous speech recognition (LVCSR) system as the input to a statistical information retrieval (IR) system. Our goal is to allow a user to speak, rather than type, query terms into an IR engine and still obtain relevant documents. The purpose of these experiments is to test whether IR systems are robust to errors in the query terms introduced by the speech recognizer. If the correctly recognized words in the search query outweigh the misinformation from the incorrectly recognized words, the relevant documents will still be retrieved. This paper presents evidence that speech-driven IR can be effective, although with a reduced precision. We also find that longer spoken queries produce higher precision retrieval than shorter queries. For queries containing many (50-60) search terms and a recognizer word error rate (WER) of 27.9%, the precision at 30 documents retrieved is degraded by only 11.1%. For roughly the same WER, however, we find that queries shorter than 10-15 words suffer more than a 30% loss of precision.

[1]  Karen Spärck Jones,et al.  Robust talker-independent audio document retrieval , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[2]  Karen Spärck Jones,et al.  Acoustic indexing for multimedia retrieval and browsing , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[3]  Howard D. Wactlar,et al.  INFORMEDIATM: NEWS-ON-DEMAND EXPERIMENTS IN SPEECH RECOGNITION , 1998 .

[4]  Vijay Balasubramanian,et al.  Speech-Based Retrieval Using Semantic Co-Occurrence Filtering , 1994, HLT.

[5]  W. Bruce Croft,et al.  TREC and Tipster Experiments with Inquery , 1995, Inf. Process. Manag..

[6]  Sean Connolly,et al.  Improvements in switchboard recognition and topic identification , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[7]  David A. James,et al.  A system for unrestricted topic retrieval from radio news broadcasts , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[8]  Donna K. Harman,et al.  The DARPA TIPSTER project , 1992, SIGF.

[9]  Karen Spärck Jones,et al.  Video mail retrieval: the effect of word spotting accuracy on precision , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[10]  Howard D. Wactlar,et al.  Indexing and search of multimodal information , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.