Retrieving spoken documents by combining multiple index sources

This paper presents domain-independent methods of spoken document retrieval. Both a continuous-speech large vocabulary recognition system, and a phone-lattice word spotter, are used to locate index units within an experimental corpus of voice messages. Possible index terms are nearly unconstrained; terms not in a 20,000 word recognition system vocabulary can be identified by the word spotter at search time. Though either system alone can yield respectable retrieval performance, the two methods are complementary and work best in combination. Different ways of combining them are investigated, and it is shown that the best of these can increase retrieval average precision for a speakerindependent retrieval system to 85% of that achieved for full-text transcriptions of the test documents.

[1]  Trumpington Street,et al.  A FAST LATTICE-BASED APPROACH TO VOCABULARY INDEPENDENT WORDSPOTTING , 1994 .

[2]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[3]  Stephen E. Robertson,et al.  Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval , 1994, SIGIR '94.

[4]  Steve Renals,et al.  IPA: improved phone modelling with recurrent neural networks , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.

[5]  M. F. Porter,et al.  An algorithm for suffix stripping , 1997 .

[6]  Roger K. Moore,et al.  The application of dynamic programming techniques to non-word based topic spotting , 1995, EUROSPEECH.

[7]  S. Wray,et al.  The Medusa applications environment , 1994, 1994 Proceedings of IEEE International Conference on Multimedia Computing and Systems.

[8]  K. Sparck Jones,et al.  Video mail retrieval using voice : Report on topic spotting , 1997 .

[9]  J. Foote,et al.  WSJCAM0: A BRITISH ENGLISH SPEECH CORPUS FOR LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION , 1995 .

[10]  Michael J. Carey,et al.  Topic discrimination using higher-order statistical models of spotted keywords , 1995, Comput. Speech Lang..

[11]  Herbert Gish,et al.  Approaches to topic identification on the switchboard corpus , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.

[12]  Michael G. Christel,et al.  Automating the creation of a digital video library , 1995, MULTIMEDIA '95.

[13]  Karen Spärck Jones,et al.  Talker-independent keyword spotting for information retrieval , 1995, EUROSPEECH.

[14]  David Anthony James,et al.  The Application of Classical Informa - tion Retrieval Techniques to Spoken Documents , 1995 .

[15]  Karen Spärck Jones,et al.  Experiments in Spoken Document Retrieval , 1996, Inf. Process. Manag..

[16]  S. J. Young,et al.  Tree-based state tying for high accuracy acoustic modelling , 1994 .

[17]  Stephen E. Robertson,et al.  Okapi at TREC-3 , 1994, TREC.

[18]  Stephen E. Robertson,et al.  GatfordCentre for Interactive Systems ResearchDepartment of Information , 1996 .

[19]  E. A. Fox,et al.  Combining the Evidence of Multiple Query Representations for Information Retrieval , 1995, Inf. Process. Manag..

[20]  Karen Spärck Jones,et al.  Video mail retrieval: the effect of word spotting accuracy on precision , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[21]  Steve Renals,et al.  WSJCAMO: a British English speech corpus for large vocabulary continuous speech recognition , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[22]  Herbert Gish,et al.  Reducing word error rate on conversational speech from the Switchboard corpus , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[23]  Donna K. Harman,et al.  Overview of the Third Text REtrieval Conference (TREC-3) , 1995, TREC.

[24]  Re. Techniques for Information Retrieval from Speech Messages , 1991 .

[25]  Karen Spärck Jones,et al.  Robust talker-independent audio document retrieval , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[26]  K. Sparck Jones,et al.  Simple, proven approaches to text retrieval , 1994 .

[27]  Steve J. Young,et al.  Tree-Based State Tying for High Accuracy Modelling , 1994, HLT.

[28]  K. Sparck Jones,et al.  Video mail retrieval using voice: report on collection of naturalistic requests and relevance assessments , 1996 .