Efficient visual attention based framework for extracting key frames from videos

The huge volume of video data on the internet calls for efficient video browsing and retrieval strategies. One viable solution is to summarize videos as sets of key frames. Video summarization based on visual attention modeling has recently come into use: visually salient frames are extracted as key frames on the basis of models of human attention. These schemes have proved effective for video summarization, but their high computational cost limits their applicability in practical scenarios. This paper therefore proposes an efficient key frame extraction method based on a visual attention model. The computational cost is reduced by detecting dynamic visual saliency with temporal gradients instead of traditional optical flow methods. For static visual saliency, an effective method based on the discrete cosine transform is used. The static and dynamic visual attention measures are combined using a non-linear weighted fusion method. Experimental results indicate that the proposed method is not only efficient but also yields high-quality video summaries.
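The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give the exact formulas, so several pieces are assumptions: the static saliency uses a DCT sign-reconstruction map as one possible DCT-based method, the temporal gradient is a plain absolute frame difference, the `fuse` rule (each map weighted by its own relative strength) is a hypothetical stand-in for the non-linear weighted fusion, and the frame score is simply the mean of the fused map. Frames are assumed to be grayscale arrays with values in [0, 1]; all function names are illustrative.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def static_saliency(frame):
    """Assumed DCT-based static saliency: reconstruct from the sign of the
    2-D DCT coefficients, square, and scale so the peak equals 1."""
    h, w = frame.shape
    dh, dw = dct_matrix(h), dct_matrix(w)
    coeffs = dh @ frame @ dw.T            # forward 2-D DCT
    recon = dh.T @ np.sign(coeffs) @ dw   # inverse 2-D DCT of the sign pattern
    sal = recon ** 2
    peak = sal.max()
    return sal / peak if peak > 0 else sal

def dynamic_saliency(prev_frame, frame):
    """Temporal gradient as an absolute frame difference (no optical flow)."""
    return np.abs(frame - prev_frame)

def fuse(static_map, dynamic_map, eps=1e-8):
    """Hypothetical non-linear fusion: each map is weighted by its own
    share of the total response, so the stronger cue dominates."""
    total = static_map + dynamic_map + eps
    return (static_map ** 2 + dynamic_map ** 2) / total

def attention_scores(frames):
    """One fused attention score per frame, starting at frame index 1
    because the temporal gradient needs a previous frame."""
    scores = []
    for t in range(1, len(frames)):
        s = static_saliency(frames[t])
        d = dynamic_saliency(frames[t - 1], frames[t])
        scores.append(float(fuse(s, d).mean()))
    return scores

def pick_key_frame(frames):
    """Index of the frame with the highest attention score (assumed rule)."""
    return 1 + int(np.argmax(attention_scores(frames)))
```

Calling `pick_key_frame(frames)` on a list of equally sized 2-D arrays returns a single frame index. A practical summarizer would more likely select several key frames, for example one per shot or per local maximum of the attention curve; that selection step is not specified in the abstract and is left out here.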
