Evaluation of Spoken Document Retrieval for Historic Speech Collections

The re-use of spoken word audio collections maintained by audiovisual archives is severely hindered by their generally limited accessibility. The CHoral project, part of the CATCH program funded by the Dutch Research Council, aims to give users of speech archives online rather than on-location access, and access to relevant fragments rather than full documents. To meet this goal, a spoken document retrieval framework is being developed. This paper presents the evaluation efforts undertaken so far to assess and improve various aspects of the framework. These efforts include (i) evaluation of the automatically generated textual representations of the spoken word documents that enable word-based search, (ii) development of measures to estimate the quality of these textual representations for use in information retrieval, and (iii) studies to establish the potential user groups of the technology under development and to assess the first versions of the user interface supporting online access to spoken word collections.
