Keyframe Extraction Using Local Visual Semantics in the Form of a Region Thesaurus

This paper presents an approach for efficient keyframe extraction that uses local semantics in the form of a region thesaurus. More specifically, certain MPEG-7 color and texture features are extracted locally from keyframe regions. Then, using a hierarchical clustering approach, a local region thesaurus is constructed to facilitate the description of each frame in terms of higher-level semantic features. The thesaurus consists of the region types most commonly encountered within the video shot, along with their synonyms. These region types carry semantic information. Each keyframe is represented by a vector consisting of the degrees of confidence in the existence of each region type within the shot. Using this representation, the most representative keyframe is selected for each shot. Where a single keyframe is not adequate, more keyframes are extracted with the same algorithm by exploiting the presence of the region types of the visual thesaurus.
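The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the linkage criterion, the confidence measure, or the selection rule, so the centroid-linkage clustering, the `1 / (1 + distance)` confidence, and the "closest to the mean vector" selection are all assumptions made for illustration. The function names (`build_thesaurus`, `frame_vector`, `select_keyframe`) are hypothetical, region types are reduced to plain centroids without synonyms, and region descriptors are assumed to be already available as plain feature vectors rather than actual MPEG-7 descriptors.

```python
import numpy as np

def build_thesaurus(regions, n_types):
    """Agglomerative clustering with centroid linkage (an assumed choice).

    regions: (N, D) array of region feature vectors from one shot.
    Returns an (n_types, D) array with one centroid per region type.
    """
    clusters = [[i] for i in range(len(regions))]
    while len(clusters) > n_types:
        cents = np.array([regions[c].mean(axis=0) for c in clusters])
        d = np.linalg.norm(cents[:, None] - cents[None, :], axis=2)
        np.fill_diagonal(d, np.inf)
        i, j = np.unravel_index(np.argmin(d), d.shape)
        i, j = int(min(i, j)), int(max(i, j))
        # Merge the two closest clusters.
        clusters[i] += clusters.pop(j)
    return np.array([regions[c].mean(axis=0) for c in clusters])

def frame_vector(frame_regions, thesaurus):
    """One confidence value per region type (assumed form: 1 / (1 + d)),

    where d is the distance from the region type to the closest region
    of the frame.
    """
    d = np.linalg.norm(frame_regions[:, None] - thesaurus[None, :], axis=2)
    return 1.0 / (1.0 + d.min(axis=0))

def select_keyframe(frames, n_types):
    """Pick the frame whose vector is closest to the shot's mean vector.

    frames: list of (n_regions_i, D) arrays, one per frame of the shot.
    Returns (index of the selected frame, matrix of frame vectors).
    """
    thesaurus = build_thesaurus(np.vstack(frames), n_types)
    vectors = np.array([frame_vector(f, thesaurus) for f in frames])
    centroid = vectors.mean(axis=0)
    best = int(np.argmin(np.linalg.norm(vectors - centroid, axis=1)))
    return best, vectors
```

Under these assumptions, a frame that contains every common region type scores high confidences throughout and lies near the mean vector, while a frame that lacks one region type is pushed away from it and is unlikely to be selected.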
