Real-world audio indexing systems

We motivate the need for, and describe the key components of, real-world audio indexing systems. In particular, we discuss the various flavors of such systems, the advantages and disadvantages of each, user interfaces, system architectures, and evaluation issues. Throughout the paper, we give examples from our own experience of audio indexing with SpeechBot and its successor, NewsTuner.
