Eratosthenes sieve based key-frame extraction technique for event summarization in videos

The rapid growth of video data demands effective and efficient video summarization methods that let users quickly browse and comprehend large amounts of video content. Managing access to video content in real time is a formidable task when an enormous amount of audiovisual data is recorded every second. In this paper, we propose an Eratosthenes Sieve based key-frame extraction approach for video summarization (VS) that is better suited to real-time applications. The Eratosthenes Sieve is used to generate the set of prime-numbered frames and the set of non-prime-numbered frames, up to the total of N frames in a video. The k-means clustering procedure is then applied to these sets to extract the key-frames quickly. The challenge is to find the optimal set of clusters, which is addressed by employing the Davies-Bouldin Index (DBI). DBI is a cluster validation technique that gives users a parameter-free VS approach for choosing the desired number of key-frames without incurring additional computational cost. Moreover, the proposed approach covers videos of both local and global perspective. Using the Eratosthenes Sieve strongly improves the performance of the clustering procedure. Qualitative and quantitative evaluations and a complexity analysis are carried out to compare the performance of the proposed model with that of state-of-the-art models. Experimental results on two benchmark datasets containing various types of videos show that the proposed method outperforms the state-of-the-art models in terms of F-measure.
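
The two building blocks named above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: frames are indexed from 1 to N, index 1 is treated as non-prime, each frame is represented by a numeric feature vector (the abstract does not say which features are used), and DBI is computed with Euclidean distances. The function names `partition_frames` and `davies_bouldin` are hypothetical. The k-means step itself is omitted; the sketch only shows how a given cluster assignment could be scored.

```python
from math import dist  # Euclidean distance, Python 3.8+


def partition_frames(n):
    """Split frame indices 1..n into (prime, non-prime) lists using the
    Sieve of Eratosthenes. Indexing from 1 is an assumption of this sketch."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False] * min(2, n + 1)  # 0 and 1 are not prime
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Cross out every multiple of p, starting at p*p.
            for m in range(p * p, n + 1, p):
                is_prime[m] = False
        p += 1
    primes = [i for i in range(1, n + 1) if is_prime[i]]
    non_primes = [i for i in range(1, n + 1) if not is_prime[i]]
    return primes, non_primes


def davies_bouldin(points, labels):
    """Davies-Bouldin Index of a clustering (lower is better).
    Requires at least two clusters with distinct centroids."""
    clusters = {}
    for pt, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(pt)
    # Centroid of each cluster: coordinate-wise mean.
    cents = {lab: tuple(sum(c) / len(pts) for c in zip(*pts))
             for lab, pts in clusters.items()}
    # Scatter of each cluster: mean distance of its points to the centroid.
    scatter = {lab: sum(dist(p, cents[lab]) for p in pts) / len(pts)
               for lab, pts in clusters.items()}
    labs = list(clusters)
    total = 0.0
    for i in labs:
        # Worst-case similarity ratio of cluster i to any other cluster.
        total += max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                     for j in labs if j != i)
    return total / len(labs)


if __name__ == "__main__":
    primes, non_primes = partition_frames(10)
    print(primes)      # prime-numbered frame indices
    print(non_primes)  # non-prime-numbered frame indices

    # Toy 2-D "frame features" (hypothetical) with two candidate labelings.
    feats = [(0, 0), (0, 2), (10, 0), (10, 2)]
    print(davies_bouldin(feats, [0, 0, 1, 1]))  # compact, well-separated clusters
    print(davies_bouldin(feats, [0, 1, 0, 1]))  # poorly separated clusters
```

Under one reading of the abstract, a clustering would be computed for several candidate cluster counts on each frame set, and the count with the lowest DBI would be kept; how the paper combines the results from the prime and non-prime sets is not stated here.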
