A Qualitative Study Towards Using Large Vocabulary Automatic Speech Recognition to Index Recorded Presentations for Search and Access over the Web

Recording lectures and putting them on the Web for student access has become a general trend at various universities. To take full advantage of the knowledge database built from these documents, elaborate search functionality has to be provided that goes beyond search at the metadata level and performs a detailed analysis of the corresponding multimedia documents. In this paper, we present experiments we conducted towards setting up a Web-based search engine for audio recordings of presentations. We evaluate standard, state-of-the-art speech recognition software as well as the achievable retrieval performance. In addition, we compare the speech retrieval results with a traditional, text-based search approach to assess the value of speech processing for lecture retrieval.
