Multi-scale contrast and relative motion-based key frame extraction

The huge amount of video data available these days requires effective management techniques for storage, indexing, and retrieval. Video summarization, a method to manage video data, provides concise versions of the videos for efficient browsing and retrieval. Key frame extraction is a form of video summarization which selects only the most salient frames from a given video. Since the automatic semantic understanding of the video contents is not possible so far, most of the existing works employ low level index features for extracting key frames. However, the usage of low level features results in loss of semantic details, thus leading to a semantic gap. In this context, the saliency-based user attention modeling technique can be used to bridge this semantic gap. In this paper, a key frame extraction scheme based on a visual attention mechanism is proposed. The proposed scheme builds static visual attention method based on multi-scale contrast instead of usual color contrast. The dynamic visual attention model is developed based on novel relative motion intensity and relative motion orientation. An efficient fusion scheme for combining three visual attention values is then proposed. A flexible technique is then used for key frame extraction. The experimental results demonstrate that the proposed mechanism provides excellent results as compared to the some of the other prominent techniques in the literature.

[1]  Arnaldo de Albuquerque Araújo,et al.  VSUMM: A mechanism designed to produce static video summaries and a novel evaluation method , 2011, Pattern Recognit. Lett..

[2]  Sung Wook Baik,et al.  Efficient visual attention based framework for extracting key frames from videos , 2013, Signal Process. Image Commun..

[3]  Tie-Yan Liu,et al.  Shot reconstruction degree: a novel criterion for key frame selection , 2004, Pattern Recognit. Lett..

[4]  Nanning Zheng,et al.  Learning to Detect a Salient Object , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Kristen Grauman,et al.  Story-Driven Summarization for Egocentric Video , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  Bin Zhao,et al.  Quasi Real-Time Summarization for Consumer Videos , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[7]  David S. Doermann,et al.  Video summarization by curve simplification , 1998, MULTIMEDIA '98.

[8]  Ali Farhadi,et al.  Ranking Domain-Specific Highlights by Analyzing Edited Videos , 2014, ECCV.

[9]  Daniel Sánchez,et al.  Modelling subjectivity in visual perception of orientation for image retrieval , 2003, Inf. Process. Manag..

[10]  Jonathan Foote,et al.  Discriminative techniques for keyframe selection , 2005, 2005 IEEE International Conference on Multimedia and Expo.

[11]  Shigeo Morishima,et al.  Court-Based Volleyball Video Summarization Focusing on Rally Scene , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[12]  C. Schmid,et al.  Category-Specific Video Summarization , 2014, ECCV.

[13]  Yale Song,et al.  Video co-summarization: Video summarization by visual co-occurrence , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Ba Tu Truong,et al.  Video abstraction: A systematic review and classification , 2007, TOMCCAP.

[15]  Stan Z. Li,et al.  Online content-aware video condensation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Li Sun,et al.  Event-based large scale surveillance video summarization , 2016, Neurocomputing.

[17]  Harry W. Agius,et al.  Video summarisation: A conceptual framework and survey of the state of the art , 2008, J. Vis. Commun. Image Represent..

[18]  Peng Jiang,et al.  Keyframe-Based Video Summary Using Visual Attention Clues , 2010, IEEE Multim..

[19]  Seong-Dae Kim,et al.  Iterative key frame selection in the rate-constraint environment , 2003, Signal Process. Image Commun..

[20]  Junsong Yuan,et al.  From Keyframes to Key Objects: Video Summarization by Representative Object Proposal Selection , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Sung Wook Baik,et al.  Adaptive key frame extraction for video summarization using an aggregation mechanism , 2012, J. Vis. Commun. Image Represent..

[22]  Yelena Yesha,et al.  Keyframe-based video summarization using Delaunay clustering , 2006, International Journal on Digital Libraries.

[23]  Takeo Kanade,et al.  An Iterative Image Registration Technique with an Application to Stereo Vision , 1981, IJCAI.

[24]  Gunhee Kim,et al.  Storyline Representation of Egocentric Videos with an Applications to Story-Based Search , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[25]  H. P. Meana,et al.  A fast and effective method for static video summarization on compressed domain , 2016, IEEE Latin America Transactions.

[26]  Ke Zhang,et al.  Summary Transfer: Exemplar-Based Subset Selection for Video Summarization , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Fadi Dornaika,et al.  Decremental Sparse Modeling Representative Selection for prototype selection , 2015, Pattern Recognit..

[28]  Guillermo Sapiro,et al.  See all by looking at a few: Sparse modeling for finding representative objects , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Luc Van Gool,et al.  Video summarization by learning submodular mixtures of objectives , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Lie Lu,et al.  A generic framework of user attention model and its application in video summarization , 2005, IEEE Trans. Multim..

[31]  Alexander Toet,et al.  Computational versus Psychophysical Bottom-Up Image Saliency: A Comparative Evaluation Study , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32]  Marco Pellegrini,et al.  STIMO: STIll and MOving video storyboard for the web scenario , 2009, Multimedia Tools and Applications.

[33]  Ananda S. Chowdhury,et al.  Scalable Video Summarization Using Skeleton Graph and Random Walk , 2014, 2014 22nd International Conference on Pattern Recognition.

[34]  Yang Yi,et al.  Key frame extraction based on visual attention model , 2012, J. Vis. Commun. Image Represent..