Exploring video content structure for hierarchical summarization

Abstract. In this paper, we propose a hierarchical video summarization strategy that explores video content structure to provide users with a scalable, multilevel video summary. First, video-shot-segmentation and keyframe-extraction algorithms are applied to parse video sequences into physical shots and discrete keyframes. Next, an affinity (self-correlation) matrix is constructed to merge visually similar shots into clusters (supergroups). Because high visual similarity between shots does not necessarily mean that they belong to the same story unit, temporal information is also used: temporally adjacent shots (within a specified distance) from each supergroup are merged into a video group. A video-scene-detection algorithm is then proposed to merge temporally or spatially correlated video groups into scenario units. This is followed by a scene-clustering algorithm that eliminates visual redundancy among the units. A hierarchical video content structure with increasing granularity is constructed, running from the clustered scenes through video scenes and video groups down to keyframes. Finally, we introduce a hierarchical video summarization scheme that applies different approaches at different levels of the video content hierarchy to construct the video summary statically or dynamically. Extensive experiments on real-world videos have been performed to validate the effectiveness of the proposed approach.
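The shot-grouping steps described above (affinity matrix, supergroups, then temporally constrained video groups) can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the similarity measure or the clustering rule, so the Gaussian kernel on L1 histogram distance, the thresholded connected-components clustering, and the names `affinity_matrix`, `supergroups`, `split_by_time`, `sigma`, `threshold`, and `max_gap` are all assumptions made for illustration. Each shot is assumed to be represented by one normalized keyframe histogram.

```python
from math import exp

def affinity_matrix(features, sigma=0.5):
    """Symmetric shot-to-shot affinity (assumed: Gaussian kernel on L1 distance)."""
    n = len(features)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = sum(abs(a - b) for a, b in zip(features[i], features[j]))
            A[i][j] = exp(-(d * d) / (2 * sigma * sigma))
    return A

def supergroups(A, threshold=0.8):
    """Cluster visually similar shots (assumed: connected components of the
    graph that links shots whose affinity is at least `threshold`)."""
    n = len(A)
    label = [-1] * n
    groups = []
    for seed in range(n):
        if label[seed] != -1:
            continue
        label[seed] = len(groups)
        stack, members = [seed], []
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(n):
                if label[j] == -1 and A[i][j] >= threshold:
                    label[j] = len(groups)
                    stack.append(j)
        groups.append(sorted(members))
    return groups

def split_by_time(supergroup, max_gap=2):
    """Split a sorted supergroup wherever consecutive shots are more than
    `max_gap` shot positions apart, giving temporally compact video groups."""
    groups = [[supergroup[0]]]
    for shot in supergroup[1:]:
        if shot - groups[-1][-1] <= max_gap:
            groups[-1].append(shot)
        else:
            groups.append([shot])
    return groups

def video_groups(features, sigma=0.5, threshold=0.8, max_gap=2):
    """Affinity matrix -> supergroups -> temporally constrained video groups."""
    A = affinity_matrix(features, sigma)
    result = []
    for sg in supergroups(A, threshold):
        result.extend(split_by_time(sg, max_gap))
    return sorted(result)

# Toy input: three visual "looks" (a, b, c); shot 7 repeats look a much later.
a, b, c = [1, 0, 0], [0, 1, 0], [0, 0, 1]
shots = [a, b, a, b, c, c, c, a]
print(video_groups(shots))
```

In the toy input, shots 0, 2, and 7 look alike and fall into one supergroup, but shot 7 is too far away in time and ends up in its own video group. This mirrors the abstract's point that visual similarity alone does not imply membership in the same story unit.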
