Video Manga: generating semantically meaningful video summaries

This paper presents methods for automatically creating pictorial video summaries that resemble comic books. The relative importance of video segments is computed from their length and novelty. Image and audio analysis is used to automatically detect and emphasize meaningful events. Based on this importance measure, we choose relevant keyframes. Selected keyframes are sized by importance, and then efficiently packed into a pictorial summary. We present a quantitative measure of how well a summary captures the salient events in a video, and show how it can be used to improve our summaries. The result is a compact and visually pleasing summary that captures semantically important events, and is suitable for printing or Web access. Such a summary can be further enhanced by including text captions derived from OCR or other methods. We describe how the automatically generated summaries are used to simplify access to a large collection of videos.
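As a rough illustration of how an importance score based on length and novelty could drive keyframe sizing, the sketch below scores each segment by its length times the log-inverse of its cluster's share of the total video length, so that long segments from rarely seen clusters rank highest. The abstract does not give the formula. The scoring form, the size thresholds, and the names `segment_importance` and `keyframe_size` are all assumptions made for illustration, and the segments are assumed to be already clustered by visual similarity.

```python
import math


def segment_importance(segments):
    """Score segments given as (cluster_id, length_in_seconds) pairs.

    Assumed form: score = length * log(1 / cluster_weight), where
    cluster_weight is the fraction of the total video length that the
    segment's cluster occupies. Rare clusters therefore count as novel.
    """
    total = sum(length for _, length in segments)
    cluster_length = {}
    for cluster_id, length in segments:
        cluster_length[cluster_id] = cluster_length.get(cluster_id, 0.0) + length

    scores = []
    for cluster_id, length in segments:
        weight = cluster_length[cluster_id] / total
        scores.append(length * math.log(1.0 / weight))
    return scores


def keyframe_size(score, max_score):
    """Map a score to a frame size class: 0 (dropped), 1, 2, or 3.

    The thresholds (1/8, 1/4, 1/2 of the maximum score) are illustrative.
    """
    if max_score <= 0:
        return 0
    ratio = score / max_score
    if ratio < 0.125:
        return 0
    if ratio < 0.25:
        return 1
    if ratio < 0.5:
        return 2
    return 3


# Hypothetical input: two 10-second segments from a common cluster "a"
# and one 5-second segment from a rare cluster "b".
segments = [("a", 10.0), ("a", 10.0), ("b", 5.0)]
scores = segment_importance(segments)
sizes = [keyframe_size(s, max(scores)) for s in scores]
```

In this toy input the short segment from the rare cluster outscores the longer segments from the common one, so it receives the largest frame. Packing the sized frames into a page layout is a separate step that this sketch does not cover.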
