Indexing, Browsing, and Searching of Digital Video and Digital Audio Information

In this chapter we examine techniques for providing content-based access to information stored in continuous media, namely digital audio and digital video. Our coverage of audio centers on post-processing the output of automatic speech or phone recognition, and we describe the various approaches that have been taken in this area. To give reasonable coverage of the possibilities and limitations of content-based access to digital video, we sketch, at a high level, the approaches taken in various video compression algorithms, principally the MPEG family. We then address approaches to shot and scene boundary detection, the choice of representative frames for browsing and for search, and the various browsing interfaces that have been developed. We finish with an overview of likely future developments in this area.
