Disclosing spoken culture: user interfaces for access to spoken word archives

Over the past century alone, millions of hours of audiovisual data have been collected, with great potential for new creative productions, research, and education, among other uses. The actual (re-)use of these collections, however, is severely hindered by how difficult they generally are to access. This paper proposes a framework for improved access to spoken content from the cultural heritage domain, with a focus on online user interface designs that support access to speech archives. It presents an evaluation of the user interface for one instantiation of the framework and proposes future work on adapting this first prototype to other collections and archives.
