Paper Information - Enhanced Multimedia Content Access and Exploitation Using Semantic Speech Retrieval

Techniques for automatic annotation of spoken content making use of speech recognition technology have long been characterized as holding unrealized promise to provide access to archives inundated with undisclosed multimedia material. This paper provides an overview of techniques and trends in semantic speech retrieval, which is taken to encompass all approaches offering meaning-based access to spoken word collections. We present descriptions, examples and insights for current techniques, including facing real-world heterogeneity, aligning parallel resources and exploiting collateral collections. We also discuss ways in which speech recognition technology can be used to create multimedia connections that make new modes of access available to users. We conclude with an overview of the challenges for semantic speech retrieval in the workflow of a real-world archive and perspectives on future tasks in which speech retrieval integrates information related to affect and appeal, dimensions that transcend topic.

Martha Larson | Franciska de Jong | Roeland Ordelman
