The THISL broadcast news retrieval system.

This paper describes the THISL spoken document retrieval system for British and North American Broadcast News. The system combines the ABBOT large-vocabulary speech recognizer, which uses a recurrent network acoustic model, with a probabilistic text retrieval system. We discuss the development of a real-time British English Broadcast News system and its integration into a spoken document retrieval system. Detailed evaluation is performed using a similar North American Broadcast News system, to take advantage of the TREC SDR evaluation methodology. We report results on this evaluation, with particular reference to the effects of query expansion and of automatic segmentation algorithms.
