Video rushes summarization using spectral clustering and sequence alignment

This paper describes a system for summarizing video rushes. Rushes pose three basic problems: the presence of useless frames, such as colorbars, monochrome frames, and frames containing clapboards; the repetition of similar segments produced by multiple takes of the same scene; and the efficient representation of the original video in the summary. In the proposed method, the input video is first segmented into shots. Colorbars and monochrome frames are then removed by checking their edge direction histograms, whereas frames containing clapboards are removed by checking their SIFT descriptors. Next, the key-frames of each shot are extracted with an enhanced spectral clustering algorithm that estimates the number of clusters and, after computing the eigenvectors of the similarity matrix, applies the fast global k-means algorithm in the clustering stage, so that shot content is represented efficiently. Similar shots are grouped by comparing their key-frames with a sequence alignment algorithm. Each group is represented by its longest shot, and the final video summary is generated by concatenating the frames around the key-frames of each such shot. Experiments on the TRECVID 2008 test data indicate that the method performs well.
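The shot-comparison step can be illustrated with a minimal sketch. It assumes a Smith-Waterman-style local alignment over the two shots' key-frame sequences, with each key-frame reduced to an L1-normalized histogram. The function names, the histogram-intersection similarity, the match threshold, and all scoring values are illustrative assumptions and are not taken from the paper.

```python
from typing import Callable, List, Sequence

Histogram = Sequence[float]


def histogram_intersection(a: Histogram, b: Histogram) -> float:
    """Similarity in [0, 1] for two L1-normalized histograms (assumed feature)."""
    return sum(min(x, y) for x, y in zip(a, b))


def local_alignment_score(
    shot_a: Sequence[Histogram],
    shot_b: Sequence[Histogram],
    similarity: Callable[[Histogram, Histogram], float] = histogram_intersection,
    match_threshold: float = 0.8,    # assumed: key-frames this similar count as a match
    match_reward: float = 1.0,       # assumed scoring values
    mismatch_penalty: float = -1.0,
    gap_penalty: float = -0.5,
) -> float:
    """Return the best local alignment score between two key-frame sequences."""
    rows, cols = len(shot_a) + 1, len(shot_b) + 1
    # score[i][j] is the best score of an alignment ending at a[i-1], b[j-1].
    score: List[List[float]] = [[0.0] * cols for _ in range(rows)]
    best = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            is_match = similarity(shot_a[i - 1], shot_b[j - 1]) >= match_threshold
            diagonal = score[i - 1][j - 1] + (match_reward if is_match else mismatch_penalty)
            skip_a = score[i - 1][j] + gap_penalty
            skip_b = score[i][j - 1] + gap_penalty
            # Clamping at zero lets an alignment restart anywhere (local alignment).
            score[i][j] = max(0.0, diagonal, skip_a, skip_b)
            best = max(best, score[i][j])
    return best


if __name__ == "__main__":
    take_1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    take_2 = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # same scene, started later
    print(local_alignment_score(take_1, take_2))
```

Two shots could then be placed in the same group when their score, possibly normalized by the length of the shorter sequence, exceeds a chosen threshold; that grouping rule is likewise an assumption here.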
