Video scene segmentation by improved visual shot coherence

There is increasing interest in video scene segmentation due to the huge amount of video available through services like YouTube. Although some techniques obtain relatively good precision and recall when segmenting video into scenes, they are limited by their high computational cost. A well-known technique for video scene segmentation is the shot coherence model, which yields lower precision and recall than state-of-the-art methods, such as those based on machine learning and multimodality, but stands out for its simplicity. Improving techniques based on shot coherence models could also benefit these state-of-the-art segmentation methods. This paper therefore presents a new technique for scene segmentation using shot coherence and optical flow features. The technique is evaluated in terms of precision, recall and F1, obtaining results close to, or even better than, those obtained by related work.
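
For readers unfamiliar with the evaluation measures named above, the following is a minimal sketch of how precision, recall and F1 are commonly computed for scene boundaries. It is an illustration under stated assumptions, not the paper's evaluation protocol: the function name `boundary_scores`, the representation of boundaries as shot indices, and the exact-match criterion are all hypothetical choices made here.

```python
def boundary_scores(predicted, truth):
    """Return (precision, recall, f1) for scene boundaries.

    Assumption: each boundary is the index of the shot that starts a new
    scene, and a predicted boundary counts as correct only when it matches
    a ground-truth boundary exactly.
    """
    predicted, truth = set(predicted), set(truth)
    hits = len(predicted & truth)  # correctly detected boundaries

    # Precision: fraction of predicted boundaries that are correct.
    precision = hits / len(predicted) if predicted else 0.0
    # Recall: fraction of ground-truth boundaries that were found.
    recall = hits / len(truth) if truth else 0.0
    # F1: harmonic mean of precision and recall (0 when there are no hits).
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up boundaries, used only to show the call.
    p, r, f = boundary_scores([5, 12, 20, 31], [5, 12, 25, 31, 40])
    print(round(p, 3), round(r, 3), round(f, 3))  # 0.75 0.6 0.667
```

Here three of the four predicted boundaries are correct (precision 0.75) and three of the five true boundaries are found (recall 0.6). Evaluations in this area sometimes accept a prediction that falls within a small tolerance of a true boundary; that variant is not shown.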
