Efficient Filtering and Clustering Methods for Temporal Video Segmentation and Visual Summarization

Automatic temporal segmentation and visual summary generation methods that require minimal user interaction are key requirements in video information management systems. Clustering is well suited to these goals because it allows direct integration of multiple information sources. This paper proposes a clustering-based framework that performs both tasks automatically and with a minimum of user-defined parameters. Multiple frame difference features and short-time techniques are used for efficient detection of cut-type shot boundaries. Generic temporal filtering methods are applied to the signals used in shot boundary detection, resulting in better suppression of false alarms. Clustering is also extended to the key frame extraction problem: color-based shot representations are provided by average and intersection histograms, which are then used in a clustering scheme to identify reference key frames within each shot. The technique achieves good compaction with a minimum number of visually nonredundant key frames.
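
The pipeline described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the features, filters, or clustering algorithm, so every choice here is an assumption made for illustration. Specifically, the frame difference feature is a single histogram L1 distance (the paper uses multiple features), the temporal filter is a sliding-window median subtraction, the clustering step is a two-class 1-D k-means, and the average and intersection histograms are taken as the bin-wise mean and bin-wise minimum over a shot. All function names are hypothetical.

```python
from statistics import median


def hist_diff(h1, h2):
    """Half the L1 distance between two normalized histograms (0 = identical, 1 = disjoint)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def median_detrend(signal, radius=2):
    """Assumed temporal filter: subtract a sliding-window median so that slow
    changes are suppressed and isolated peaks (candidate cuts) remain."""
    out = []
    for i, value in enumerate(signal):
        window = signal[max(0, i - radius): i + radius + 1]
        out.append(max(0.0, value - median(window)))
    return out


def two_means_1d(values, iterations=20):
    """Assumed clustering step: split scalars into a low and a high cluster
    and return the indices that fall in the high cluster."""
    low, high = min(values), max(values)
    if low == high:
        return []  # flat signal: nothing stands out as a boundary
    for _ in range(iterations):
        mid = (low + high) / 2.0
        lows = [v for v in values if v <= mid]
        highs = [v for v in values if v > mid]
        low = sum(lows) / len(lows)
        high = sum(highs) / len(highs)
    mid = (low + high) / 2.0
    return [i for i, v in enumerate(values) if v > mid]


def detect_cuts(histograms):
    """Return the index of the first frame of each new shot."""
    diffs = [hist_diff(a, b) for a, b in zip(histograms, histograms[1:])]
    filtered = median_detrend(diffs)
    return [i + 1 for i in two_means_1d(filtered)]


def average_histogram(histograms):
    """Bin-wise mean of the frame histograms in one shot."""
    n = len(histograms)
    return [sum(bins) / n for bins in zip(*histograms)]


def intersection_histogram(histograms):
    """Bin-wise minimum of the frame histograms in one shot."""
    return [min(bins) for bins in zip(*histograms)]


if __name__ == "__main__":
    # Synthetic clip: five frames of one color distribution, then five of another.
    clip = [[1.0, 0.0, 0.0]] * 5 + [[0.0, 1.0, 0.0]] * 5
    print(detect_cuts(clip))
```

Clustering the filtered difference signal into two groups stands in for a hand-tuned threshold, which is in the spirit of the abstract's goal of minimizing user-defined parameters. A fuller version would cluster vectors of several difference features rather than a single scalar per frame pair.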
