A Video Browsing Application based on visual MPEG-7 Descriptors and Self-Organising Maps

The paper introduces a novel approach to interactive video browsing that makes video content fully transparent to the user. Video clips are analysed and indexed by two tree structures: a content index tree representing the content of automatically segmented video shots, and a time index tree representing the temporal structure. The top levels of each index give an overview of the entire content; subsequent levels show content relationships in more detail. Every level of both trees is a two-dimensional self-organising map that arranges media objects along two degrees of freedom. Media objects are represented by content-based visual MPEG-7 descriptions. The implemented navigation scheme allows the user to switch between the content index tree and the time index tree without losing the overview. Context information (position in the tree, content of the next lower level, etc.) is permanently shown in auxiliary panels. The implementation is based on the Scalable Vector Graphics (SVG) standard (for visualisation) and the MPEG-7 reference implementation. Initial evaluation results show that the proposed approach facilitates accessing video content in a novel way.
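To illustrate how one tree level could place media objects on a two-dimensional self-organising map, the following is a minimal sketch and not the paper's implementation. It assumes each media object has already been reduced to a short numeric feature vector (a stand-in for an MPEG-7 visual description); the function names `train_som` and `best_cell`, the grid size, and the decay schedules are all hypothetical choices made for this example.

```python
import math
import random


def best_cell(weights, v):
    """Return the grid position whose weight vector is closest to v."""
    return min(
        weights,
        key=lambda rc: sum((a - b) ** 2 for a, b in zip(weights[rc], v)),
    )


def train_som(vectors, rows, cols, epochs=50, seed=0):
    """Train a small rows x cols self-organising map on feature vectors."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # One weight vector per map cell, initialised randomly in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac)                  # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)  # neighbourhood shrinks over time
        for v in rng.sample(vectors, len(vectors)):  # shuffled pass over the data
            br, bc = best_cell(weights, v)
            for (r, c), w in weights.items():
                # Gaussian neighbourhood around the best-matching cell.
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2.0 * radius ** 2))
                for i in range(dim):
                    w[i] += rate * h * (v[i] - w[i])
    return weights


# Hypothetical 3-dimensional feature vectors for six video shots.
shots = [
    [0.10, 0.10, 0.20], [0.15, 0.10, 0.25], [0.10, 0.20, 0.20],
    [0.90, 0.80, 0.90], [0.85, 0.90, 0.95], [0.90, 0.90, 0.80],
]
som = train_som(shots, rows=3, cols=3)
positions = [best_cell(som, s) for s in shots]  # map cell assigned to each shot
```

Each shot ends up at the map cell whose weight vector is nearest to its feature vector, so shots with similar features tend to land in the same or neighbouring cells. A hierarchical index could then train a further map on the objects assigned to a single cell, which is one way to read the "subsequent levels" described above.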
