Disclosing spoken culture: user interfaces for access to spoken word archives

Over the past century alone, millions of hours of audiovisual data have been collected, with great potential for new creative productions, research, and education, among other uses. The actual (re-)use of these collections, however, is severely hindered by their generally limited accessibility. This paper proposes a framework for improved access to spoken content from the cultural heritage domain, with a focus on online user interface designs that support access to speech archives. It presents the evaluation of the user interface for an instantiation of the framework and proposes future work on adapting this first prototype to other collections and archives.
