New techniques for open-vocabulary spoken document retrieval

This paper presents four novel techniques for open-vocabulary spoken document retrieval: a method to detect slots that possibly contain a query feature; a method to estimate occurrence probabilities; a technique that we call collection-wide probability re-estimation; and a weighting scheme that takes advantage of the fact that long query features are detected more reliably. These four techniques have been evaluated using the TREC-6 spoken document retrieval test collection to determine the improvements in retrieval effectiveness with respect to a baseline retrieval method. Results show that the retrieval effectiveness can be improved considerably despite the large number of speech recognition errors.
