Video abstraction based on the visual attention model and online clustering

With the rapid evolution of digital video, new technologies are needed to lower the cost of video archiving, cataloging, and indexing, and to improve the efficiency and accessibility of stored video sequences. A number of methods have been proposed to meet these requirements. Video abstraction, one of the most important of these research topics, enables quick browsing of a large video database and efficient content access and representation. In this paper, a video abstraction algorithm based on the visual attention model and online clustering is proposed. First, shot boundaries are detected and key frames are extracted from each shot so that consecutive key frames in a shot are equidistant. Second, a spatial saliency map indicating the saliency value of each region of the image is generated from each key frame, and regions of interest (ROIs) are extracted according to the saliency map. Third, the key frames and their corresponding saliency maps are passed to a filter that uses several thresholds to discard key frames containing less information. Finally, the remaining key frames are clustered using an online clustering method based on the features in their ROIs. Experimental results demonstrate the performance and effectiveness of the proposed video abstraction algorithm.
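The last two stages can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the filter criteria, the ROI features, or the clustering rule, so the code assumes a single mean-saliency threshold for the filter and a simple leader-follower scheme with Euclidean distance for the online clustering. The names keep_informative, online_cluster, min_mean_saliency, and threshold are hypothetical.

```python
from math import sqrt


def keep_informative(frames, saliency_maps, min_mean_saliency):
    """Keep frames whose mean saliency reaches a threshold.

    Assumption: one threshold on the mean of the saliency map stands in
    for the paper's filter, which uses several unspecified thresholds.
    Each saliency map is a 2D list of values in [0, 1].
    """
    kept = []
    for frame, smap in zip(frames, saliency_maps):
        values = [v for row in smap for v in row]
        if sum(values) / len(values) >= min_mean_saliency:
            kept.append(frame)
    return kept


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def online_cluster(features, threshold):
    """Cluster feature vectors in a single pass (leader-follower scheme).

    Assumption: each vector joins the nearest existing cluster if its
    centroid is within `threshold`; otherwise it starts a new cluster.
    Centroids are updated as running means, so no vector is revisited.
    """
    centroids = []  # running mean of each cluster
    counts = []     # number of members in each cluster
    labels = []     # cluster index assigned to each input vector
    for f in features:
        best = None
        if centroids:
            dists = [euclidean(f, c) for c in centroids]
            nearest = min(range(len(dists)), key=dists.__getitem__)
            if dists[nearest] <= threshold:
                best = nearest
        if best is None:
            # No cluster is close enough: open a new one seeded with f.
            centroids.append(list(f))
            counts.append(1)
            best = len(centroids) - 1
        else:
            # Incremental mean update of the matched centroid.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [c + (x - c) / n
                               for c, x in zip(centroids[best], f)]
        labels.append(best)
    return labels, centroids


if __name__ == "__main__":
    # Toy ROI feature vectors: two groups that are far apart.
    feats = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [0.0, 0.1]]
    labels, centroids = online_cluster(feats, threshold=1.0)
    print(labels, len(centroids))
```

Because each key frame is compared only against the current centroids, the number of clusters does not have to be fixed in advance, which is the usual motivation for an online scheme. One representative frame per cluster could then form the abstract.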
