Spoken Document Retrieval, Automatic

Spoken document retrieval is defined as information retrieval from transcribed spoken audio, and the basic approach is described. Research experiments demonstrate that information retrieval from transcribed speech can be almost as effective as retrieval from the corresponding correct text. Retrieval performance is relatively robust to moderate levels of speech recognition error in the transcripts. Retrieval strategies using query expansion are valuable in mitigating the effects of speech recognition errors. These can be supplemented by automatic document and query augmentation, which exploits information that may be available from large parallel text corpora. Ongoing research combines spoken document retrieval with multilingual retrieval and with video retrieval, the latter requiring spoken document retrieval together with video or image analysis. Some examples of successful applications of spoken document retrieval technology in practice are presented.
