Robust talker-independent audio document retrieval

The goal of the video mail retrieval (VMR) project is to integrate state-of-the-art document retrieval methods with speech recognition to yield a robust and efficient retrieval system. The work presented here extends VMR towards an open-vocabulary, talker-independent system for retrieving spontaneously spoken audio and video messages. We present results showing successful retrieval using a standard large-vocabulary (LV) recogniser, despite the lack of a matched language model and vocabulary. We further show that integrating an LV recogniser with conventional word spotting (WS) gives more robust retrieval performance than either method alone. This paper describes the message archive used, the speech recognition methodologies, the information retrieval methods, and the experimental results.
