Integrated multimedia processing for topic segmentation and classification

We describe integrated multimedia processing for Video Scout, a system that segments and indexes TV programs according to their audio, visual, and transcript information. Video Scout represents a future direction for personal video recorders. In addition to using electronic program guide metadata and a user profile, Video Scout lets users request specific topics within a program. For example, a user can request the clip of the U.S. president speaking from a half-hour news program. Video Scout has three modules: (i) video pre-processing, (ii) segmentation and indexing, and (iii) storage and user interface. Segmentation and indexing, the core of the system, incorporates a Bayesian framework that integrates information from the audio, visual, and transcript (closed-caption) domains. This framework uses three layers to process low-, mid-, and high-level multimedia information. The high-level layer generates semantic information about TV program topics. This paper describes the elements of the system and presents results from running Video Scout on real TV programs.
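
To make the idea of Bayesian integration across modalities concrete, the sketch below fuses per-modality evidence into a posterior over topics. It is only an illustration under stated assumptions, not the Video Scout model: the abstract does not specify the network structure, so the sketch assumes the audio, visual, and transcript evidence are conditionally independent given the topic (a naive Bayes simplification). The function name `fuse_topic_posterior`, the topic labels, and all probability values are hypothetical.

```python
from math import exp, log


def fuse_topic_posterior(prior, likelihoods):
    """Combine per-modality likelihoods into a posterior over topics.

    prior:       dict mapping topic -> P(topic)
    likelihoods: list of dicts, one per modality, mapping
                 topic -> P(evidence from that modality | topic)

    Assumption (not from the paper): modalities are conditionally
    independent given the topic, and every probability is strictly positive.
    """
    log_scores = {}
    for topic, p in prior.items():
        score = log(p)
        for modality in likelihoods:
            score += log(modality[topic])
        log_scores[topic] = score

    # Normalize in log space to avoid underflow with many evidence terms.
    peak = max(log_scores.values())
    unnormalized = {t: exp(s - peak) for t, s in log_scores.items()}
    total = sum(unnormalized.values())
    return {t: v / total for t, v in unnormalized.items()}


# Hypothetical evidence for one video segment.
prior = {"news": 0.5, "commercial": 0.5}
audio = {"news": 0.7, "commercial": 0.4}        # e.g. speech vs. music cues
visual = {"news": 0.6, "commercial": 0.5}       # e.g. face or overlay-text cues
transcript = {"news": 0.9, "commercial": 0.1}   # e.g. closed-caption keywords

posterior = fuse_topic_posterior(prior, [audio, visual, transcript])
best = max(posterior, key=posterior.get)
```

In this toy setting the transcript evidence dominates, so the segment is assigned to the hypothetical "news" topic. A layered system like the one described would presumably feed mid-level detector outputs into such a step rather than hand-set numbers.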
