Shot Boundary Detection with Spatial-Temporal Convolutional Neural Networks

Digital videos are now widely used to record and share events and people's daily lives, which makes automatic semantic analysis and management of video increasingly necessary. Shot boundary detection (SBD), which aims to automatically detect the boundary frames of shots in a video, is a fundamental step in many video analysis tasks. In this paper, we propose a progressive method for shot boundary detection that combines histogram-based shot filtering with C3D-based gradual shot detection. Abrupt shot changes are detected first because of their distinct character; dividing the whole video into segments at these changes also alleviates the difficulty of locating gradual transitions that would otherwise span different shots. Then, over the segments, gradual shot detection is performed by a three-dimensional convolutional neural network model, which assigns each video clip to one of the shot types normal, dissolve, foi, or swipe. Finally, for untrimmed videos, a frame-level merging strategy is constructed to help locate shot boundaries from neighboring frames. The experimental results demonstrate that the proposed method can effectively detect shots and locate their boundaries.
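The first stage described above, histogram-based detection of abrupt shot changes followed by division of the video into segments, can be illustrated with a minimal sketch. This is not the implementation used in the paper: the frame representation (flattened grayscale pixel lists), the bin count, the half-L1 histogram distance, the threshold of 0.5, and all function names are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

# Assumption: a frame is a flat sequence of grayscale pixel values in 0..255.
Frame = Sequence[int]


def histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Normalized intensity histogram of one frame."""
    counts = [0] * bins
    for p in frame:
        counts[min(p * bins // 256, bins - 1)] += 1
    total = float(len(frame)) or 1.0  # avoid division by zero on empty frames
    return [c / total for c in counts]


def hist_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Half the L1 distance: 0.0 for identical histograms, 1.0 for disjoint ones."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def detect_cuts(frames: Sequence[Frame], threshold: float = 0.5,
                bins: int = 16) -> List[int]:
    """Return every index i where an abrupt change lies between frames i-1 and i."""
    hists = [histogram(f, bins) for f in frames]
    return [i for i in range(1, len(hists))
            if hist_distance(hists[i - 1], hists[i]) > threshold]


def split_segments(n_frames: int, cuts: Sequence[int]) -> List[Tuple[int, int]]:
    """Divide the video into half-open [start, end) segments at the detected cuts."""
    bounds = [0] + list(cuts) + [n_frames]
    return [(s, e) for s, e in zip(bounds, bounds[1:]) if e > s]


# Toy video: five dark frames followed by five bright frames.
video = [[10] * 64] * 5 + [[200] * 64] * 5
cuts = detect_cuts(video)
segments = split_segments(len(video), cuts)
print(cuts, segments)
```

On this toy input the single cut is found at frame index 5, giving the segments (0, 5) and (5, 10). In the pipeline described above, each such segment would then be passed to the 3D convolutional network for gradual-transition classification.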

[18]  Cheng Cai,et al.  TRECVID2005 Experiments in The Hong Kong Polytechnic University: Shot Boundary Detection Based on a Multi-Step Comparison Scheme , 2005, TRECVID.