Unsupervised Video Shot Detection Using Clustering Ensemble with a Color Global Scale-Invariant Feature Transform Descriptor

The scale-invariant feature transform (SIFT) converts a grayscale image into coordinates of local features that are invariant to image scale, rotation, and changes in viewpoint. Because of these properties, SIFT has been used successfully for object recognition and content-based image retrieval. Its biggest drawback is that it uses only grayscale information and therefore misses important visual information carried by color. In this paper, we present a novel color feature extraction algorithm that addresses this problem, and we propose a new clustering strategy based on clustering ensembles for video shot detection. Based on Fibonacci lattice quantization, we develop a color global scale-invariant feature transform (CGSIFT) that better describes the color content of video frames. As its first step, CGSIFT quantizes a color image, representing it with a small number of color indices, and then uses SIFT to extract features from the quantized color index image. As its second step, a new space description method uses small image regions to represent global color features. Clustering ensembles focusing on knowledge reuse are then applied to obtain better clustering results than single clustering methods provide. Evaluation of the proposed feature extraction algorithm and the ensemble clustering strategy shows promising results for video shot detection.
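To make the clustering-ensemble idea concrete, the following is a minimal sketch of how several base clusterings of the same frames might be merged into one consensus labeling. It is not the method used in the paper: it swaps in a simple co-association majority vote (two frames are linked when more than half of the base clusterings group them together, and linked frames are merged with union-find). The function names `co_association_votes`, `consensus_labels`, and `shot_boundaries`, the `threshold` parameter, and the toy partitions are all assumptions made for illustration.

```python
def co_association_votes(partitions):
    """Count, for every frame pair, how many base clusterings group them together."""
    n = len(partitions[0])
    votes = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    votes[i][j] += 1
    return votes


def consensus_labels(partitions, threshold=0.5):
    """Merge frames that co-occur in more than `threshold` of the base clusterings."""
    n = len(partitions[0])
    m = len(partitions)
    votes = co_association_votes(partitions)

    # Union-find over frames; linked frames end up sharing a root.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if votes[i][j] > threshold * m:
                parent[find(j)] = find(i)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]


def shot_boundaries(labels):
    """Frame indices where the consensus label changes between consecutive frames."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Toy example (hypothetical data): three base clusterings of eight frames.
# The second disagrees about frame 3 and the third about frames 6 and 7.
base_partitions = [
    [0, 0, 0, 1, 1, 1, 2, 2],
    [1, 1, 1, 1, 0, 0, 2, 2],
    [0, 0, 0, 2, 2, 2, 2, 1],
]
consensus = consensus_labels(base_partitions)
boundaries = shot_boundaries(consensus)
```

On this toy input the majority vote overrides each individual disagreement, so `consensus` is `[0, 0, 0, 1, 1, 1, 2, 2]` and `boundaries` is `[3, 6]`. In a real pipeline the base partitions would come from running clustering algorithms on per-frame feature vectors, which this sketch does not attempt.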
