A perceptual scheme for fully automatic video shot boundary detection

In this paper, we propose a novel and robust method for fast and accurate shot boundary detection (SBD) whose design philosophy is based on human perceptual rules and the well-known "Information Seeking Mantra". By adopting a top-down approach, the method avoids redundant video processing and achieves high detection accuracy at low computational cost. Objects within shots are detected via local image features and used to reveal visual discontinuities between shots. The proposed method can detect all types of gradual transitions as well as abrupt changes. It is also fully generic: it can be applied to any video content without prior training or tuning. Furthermore, it allows the user to direct the SBD process to the user's "Region of Interest" or to stop it once satisfactory results are obtained. Experimental results demonstrate that the proposed algorithm achieves lower computation times than state-of-the-art methods without sacrificing performance.
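To make the top-down idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes each frame has already been reduced to a set of local-feature identifiers (in practice these would come from a keypoint detector and descriptor matcher), and it uses Jaccard overlap as a stand-in similarity measure. The names `feature_overlap`, `find_cuts`, and the `threshold` value are hypothetical. The search compares only the two endpoints of a segment and descends into it only when they differ, so long runs of similar frames are never examined frame by frame.

```python
from typing import List, Sequence


def feature_overlap(a: frozenset, b: frozenset) -> float:
    """Jaccard overlap between two sets of (toy) feature identifiers."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_cuts(frames: Sequence[frozenset], threshold: float = 0.5) -> List[int]:
    """Return every index i such that a cut lies between frame i-1 and frame i.

    Assumption (simplification): if the two endpoints of a segment share
    enough features, the segment is treated as containing no boundary.
    """
    cuts: List[int] = []

    def search(lo: int, hi: int) -> None:
        # Endpoints look alike: skip the whole interior of the segment.
        if feature_overlap(frames[lo], frames[hi]) >= threshold:
            return
        # Endpoints differ and are adjacent: this is a boundary.
        if hi - lo == 1:
            cuts.append(hi)
            return
        # Otherwise split the segment and refine each half.
        mid = (lo + hi) // 2
        search(lo, mid)
        search(mid, hi)

    if len(frames) > 1:
        search(0, len(frames) - 1)
    return cuts


# Toy video: three shots of 5, 4, and 3 frames with disjoint feature sets.
shot_a = frozenset({1, 2, 3})
shot_b = frozenset({4, 5, 6})
shot_c = frozenset({7, 8, 9})
video = [shot_a] * 5 + [shot_b] * 4 + [shot_c] * 3
print(find_cuts(video))
```

This sketch only handles abrupt cuts, and it misses a segment whose endpoints match but whose interior contains another shot (for example A, B, A). A practical detector would need additional checks for that case and for gradual transitions.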
