Optimal multiscale organization of multimedia content for fast browsing and cost-effective transmission

This paper presents an interactive framework for efficient browsing and transmission of video sequences, based on an optimal content-based video decomposition scheme. In particular, each video file is analyzed to produce a multiscale structure of different "content resolution levels". This structure can be viewed as a tree: each level corresponds to a particular content resolution, and the tree nodes contain viewing elements that represent the visual content of a segment of the sequence. The multiscale organization is obtained by minimizing a cross-correlation criterion, so that the most representative shots (key-shots) of a video sequence, or the most representative frames (key-frames) of a video shot, are extracted. Experimental results on real-life video sequences show that the proposed technique lets users locate content of interest much faster than conventional sequential video scanning, and thus significantly reduces the amount of viewed and transmitted information.
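
The key-frame selection step can be illustrated with a minimal sketch. The abstract does not give the exact form of the cross-correlation criterion, so the code below assumes one plausible reading: each frame is described by a feature vector, and the selected frames are those whose summed squared pairwise Pearson correlation is smallest. The function names `correlation` and `select_key_frames`, the feature vectors, and the exhaustive search are all illustrative assumptions, not the authors' method.

```python
from itertools import combinations
from math import sqrt


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length feature vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # constant vector: treat as uncorrelated (assumption)
    return cov / (norm_x * norm_y)


def select_key_frames(features, k):
    """Return the indices of k frames with minimal summed squared correlation.

    Assumed criterion: sum of squared pairwise correlation coefficients.
    Exhaustive search is used for clarity; it is only practical for short shots.
    """
    if not 2 <= k <= len(features):
        raise ValueError("k must be between 2 and the number of frames")

    def cost(indices):
        return sum(
            correlation(features[i], features[j]) ** 2
            for i, j in combinations(indices, 2)
        )

    return min(combinations(range(len(features)), k), key=cost)


# Hypothetical per-frame feature vectors (for example, coarse color statistics).
frames = [[1, 2, 3], [2, 4, 6], [1, 0, 1], [3, 2, 1]]
key_frames = select_key_frames(frames, 2)
```

Applying the same selection to shots instead of frames, and repeating it with increasing `k`, would give one way to populate successive levels of a content-resolution tree. This is again a sketch under the assumptions stated above.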
