Video summarization: methods and landscape

The ability to summarize and abstract information will be an essential part of intelligent behavior in consumer devices. Various summarization methods have been the topic of intensive research in the content-based video analysis community. Summarization in traditional information retrieval is a well-understood problem. Although the multimedia community has studied the problem extensively, there is no agreed-upon terminology or classification of the problems in this domain. The problem has been examined from different angles, but the various dimensions of summarization are usually not distinguished. The goal of this paper is to provide basic definitions of widely used terms such as skimming, summarization, and highlighting. We make explicit the different levels of summarization: local, global, and meta-level. We distinguish among the dimensions of task, content, and method, and provide an extensive classification model for them. We map the existing summary extraction approaches in the literature onto this model and classify the aspects of the proposed systems. In addition, we outline the evaluation methods and provide a brief survey. Finally, we propose future research directions based on the gaps we identified by analyzing existing systems in the literature.
