Towards Affordable Disclosure of Spoken Heritage Archives

This paper presents and discusses ongoing work aimed at the affordable disclosure of real-world spoken heritage archives in general, and in particular of a collection of recorded interviews with Dutch survivors of the World War II concentration camp Buchenwald. For such collections, we want at a minimum to provide search at different levels and a flexible way of presenting results. Strategies for automatic annotation based on speech recognition, supporting for example within-document search, are outlined and discussed with respect to the Buchenwald interview collection. In addition, usability aspects of spoken word search are discussed on the basis of our experiences with the online Buchenwald web portal. We conclude that, although user feedback is generally fairly positive, automatic annotation performance is not yet satisfactory and requires additional research.
