Video clustering using spatio-temporal image with fixed length

To handle video media such as TV programs efficiently and effectively, we need to segment a video stream into video segments and structure them according to their content. We focus on similarity, one of the important relations between video segments, and describe a method for clustering similar segments in a video stream. Conventional clustering methods are based on shots, but no fully reliable method for detecting shot boundaries has yet been established. Our method is instead based on fixed-length segments of the video stream, called video packets. We generate a spatio-temporal image from each packet and apply co-occurrence matrices to it so that features in the time dimension are expressed explicitly. In clustering experiments on actual TV programs, we obtained a clustering accuracy of 81%.
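The pipeline outlined above (fixed-length packets, a spatio-temporal image per packet, a co-occurrence matrix over the time axis) can be illustrated with a minimal sketch. The abstract does not specify how the spatio-temporal image is sampled, how gray levels are quantized, or which temporal offset is used, so the choices below (middle scanline of each frame, 4 gray levels, an offset of one frame) and all function names are assumptions made purely for illustration, not the authors' implementation.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of pixel values in 0..255


def split_into_packets(frames: List[Frame], packet_len: int) -> List[List[Frame]]:
    """Cut a frame sequence into consecutive fixed-length packets.

    A trailing partial packet is dropped (assumption)."""
    n = len(frames) // packet_len
    return [frames[i * packet_len:(i + 1) * packet_len] for i in range(n)]


def spatio_temporal_image(packet: List[Frame]) -> List[List[int]]:
    """Stack the middle row of each frame (assumed sampling scheme).

    Rows of the result index time, columns index horizontal position."""
    return [frame[len(frame) // 2] for frame in packet]


def temporal_cooccurrence(st_image: List[List[int]], levels: int = 4) -> List[List[int]]:
    """Count pairs of quantized gray levels (value at t, value at t+1)
    observed at the same horizontal position in consecutive rows."""
    step = 256 // levels
    matrix = [[0] * levels for _ in range(levels)]
    for row_t, row_next in zip(st_image, st_image[1:]):
        for a, b in zip(row_t, row_next):
            i = min(a // step, levels - 1)
            j = min(b // step, levels - 1)
            matrix[i][j] += 1
    return matrix


if __name__ == "__main__":
    def flat(v: int) -> Frame:
        """Hypothetical 2x2 frame filled with a single gray value."""
        return [[v, v], [v, v]]

    # One static packet followed by one packet whose brightness changes.
    frames = [flat(10), flat(10), flat(10), flat(10), flat(100), flat(200)]
    for packet in split_into_packets(frames, packet_len=3):
        print(temporal_cooccurrence(spatio_temporal_image(packet)))
```

In this sketch a static packet puts all of its counts on the diagonal of the matrix, while a packet with changing brightness puts counts off the diagonal, which is the sense in which the matrix makes temporal behaviour explicit. The flattened matrices could then serve as feature vectors for any standard clustering algorithm.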
