Innovative Shot Boundary Detection for Video Indexing

Chen, Shyu, & Zhang

Recently, multimedia information, especially video data, has become overwhelmingly accessible with the rapid advances in communication and multimedia computing technologies. Video is popular in many applications, which makes the efficient management and retrieval of the growing amount of video information very important. To meet this demand, an effective video shot boundary detection method is necessary, since shot boundary detection is a fundamental operation required in many multimedia applications. In this chapter, an innovative shot boundary detection method is presented that uses an unsupervised segmentation algorithm and an object-tracking technique based on the segmentation mask maps. A series of experiments on various types of video is performed, and the experimental results show that our method can obtain object-level information of the video frames as well as accurate shot boundary detection, both of which are very useful for video content indexing.

This chapter appears in the book, Video Data Management and Information Retrieval by Sagarmay Deb. Copyright © 2005, IRM Press, an imprint of Idea Group Inc. Copying or distributing in print or electronic forms without written permission of Idea Group Inc. is prohibited.

INTRODUCTION

Unlike traditional database systems that have text or numerical data, a multimedia database or information system may contain different media such as text, image, audio, and video. Video, in particular, has become more and more popular in many applications such as education and training, video conferencing, video-on-demand (VOD), and news services.
The traditional way for users to search for certain content in a video is to fast-forward or rewind. Both are sequential processes, which makes it difficult for users to browse a video sequence directly based on their interests. Hence, it becomes important to be able to organize video data and provide the visual content in compact forms in multimedia applications (Zabih, Miller, & Mai, 1995). In many multimedia applications such as digital libraries and VOD, video shot boundary detection is fundamental and must be performed prior to all other processes (Shahraray, 1995; Zhang & Smoliar, 1994). A video shot is a video sequence that consists of continuous video frames for one action, and shot boundary detection is the operation that divides the video data into physical video shots.

Many video shot boundary detection methods have been proposed in the literature. Most of them match two consecutive frames using low-level global features, for example, the luminance pixel-wise difference (Zhang, Kankanhalli, & Smoliar, 1993), the luminance or color histogram difference (Swanberg, Shu, & Jain, 1993), the edge difference (Zabih et al., 1995), and the orientation histogram (Ngo, Pong, & Chin, 2000). However, these low-level features cannot provide satisfactory results for shot boundary detection, since luminance and color are sensitive to small changes. For example, the method proposed by Yeo and Liu (1995), which uses the luminance histogram difference of DC images, is very sensitive to luminance changes. There are also approaches that work in the compressed video data domain. For example, Lee, Kim, and Choi (2000) proposed a fast scene/shot change detection method, and Hwang and Jeong (1998) proposed a method that retrieves directional information by using the discrete cosine transform (DCT) coefficients in MPEG video data.
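To make the frame-matching idea above concrete, here is a minimal sketch of a luminance histogram difference between consecutive frames. It is not taken from any of the cited methods: the frame representation (rows of 8-bit luminance values), the function names, the 16-bin histogram, and the fixed threshold of 0.5 are all assumptions chosen for illustration.

```python
from typing import List, Sequence

# Assumption: a frame is a list of rows of 8-bit luminance values (0..255).
Frame = Sequence[Sequence[int]]


def luminance_histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Return the normalized luminance histogram of one frame."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for value in row:
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    if total == 0:
        raise ValueError("frame has no pixels")
    return [c / total for c in counts]


def histogram_difference(a: Frame, b: Frame, bins: int = 16) -> float:
    """Half the L1 distance between the two histograms, a value in [0, 1]."""
    ha = luminance_histogram(a, bins)
    hb = luminance_histogram(b, bins)
    return 0.5 * sum(abs(x - y) for x, y in zip(ha, hb))


def detect_cuts(frames: Sequence[Frame], threshold: float = 0.5) -> List[int]:
    """Return each index i where a boundary is declared between frames i-1 and i."""
    return [
        i
        for i in range(1, len(frames))
        if histogram_difference(frames[i - 1], frames[i]) > threshold
    ]


# Toy sequence: two dark frames followed by two bright frames.
dark = [[10] * 4 for _ in range(4)]
bright = [[200] * 4 for _ in range(4)]
print(detect_cuts([dark, dark, bright, bright]))  # [2]
```

In this toy setting, a uniform brightness shift is enough to move every pixel of a flat frame into a different bin (for example, from luminance 10 to 30 with 16 bins), which illustrates the kind of sensitivity to luminance changes noted above.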
In addition, dynamic and adaptive threshold determination has been applied to enhance the accuracy and robustness of the existing techniques in shot cut detection (Alattar, 1997; Gunsel, Ferman, & Tekalp, 1998; Truong, Dorai, & Venkatesh, 2000). Gunsel et al. (1998) proposed a generic technique based on unsupervised clustering that does not need threshold setting and allows multiple features to be used simultaneously, while Truong et al. (2000) proposed an adaptive threshold determination method that reduces the artifacts created by noise and motion in shot change detection.

In this chapter, we present an innovative shot boundary detection method that applies an unsupervised image-segmentation algorithm and an object-tracking technique to the uncompressed video data. In our method, the image-segmentation algorithm automatically extracts the segmentation mask map of each video frame. The mask map can be deemed the clustering feature map of the frame, in which the pixels have been grouped into different classes (e.g., two classes). Then the difference between the segmentation mask maps of two frames is checked. Moreover, to cope with camera panning and tilting, we propose an object-tracking method based on the segmentation results to enhance the matching. The cost of object tracking is almost trivial since the segmentation results are already available. In addition, the bounding boxes and the positions of [...]
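The mask-map comparison described in the introduction can be illustrated with a small sketch. This is not the chapter's algorithm: the unsupervised segmentation is replaced here by a simple stand-in (splitting pixels at the frame's mean luminance into two classes), object tracking is omitted, and the function names and the 0.3 threshold are hypothetical.

```python
from typing import List, Sequence

# Assumption: a frame is a list of rows of 8-bit luminance values (0..255).
Frame = Sequence[Sequence[int]]
MaskMap = List[List[int]]


def two_class_mask(frame: Frame) -> MaskMap:
    """Stand-in segmentation: class 1 if a pixel is brighter than the frame mean, else 0."""
    pixels = [v for row in frame for v in row]
    if not pixels:
        raise ValueError("frame has no pixels")
    mean = sum(pixels) / len(pixels)
    return [[1 if v > mean else 0 for v in row] for row in frame]


def mask_difference(a: MaskMap, b: MaskMap) -> float:
    """Fraction of pixel positions whose class labels differ between two mask maps."""
    total = sum(len(row) for row in a)
    differing = sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return differing / total


def is_boundary_candidate(prev: Frame, curr: Frame, threshold: float = 0.3) -> bool:
    """Flag a possible shot boundary when the mask maps differ on many pixels."""
    return mask_difference(two_class_mask(prev), two_class_mask(curr)) > threshold


left_bright = [[200, 200, 10, 10], [200, 200, 10, 10]]
right_bright = [[10, 10, 200, 200], [10, 10, 200, 200]]
brighter_copy = [[220, 220, 30, 30], [220, 220, 30, 30]]

print(is_boundary_candidate(left_bright, right_bright))   # True
print(is_boundary_candidate(left_bright, brighter_copy))  # False
```

In this toy example the mask map stays the same under a uniform brightness shift, because the class labels depend on how the pixels group rather than on their absolute values. A real segmentation algorithm would need its own way of keeping class labels consistent from frame to frame, which the stand-in gets for free from the brightness ordering.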

[1] Ba Tu Truong, et al. New enhancements to cut, fade, and dissolve detection processes in video segmentation, 2000, ACM Multimedia.

[2] Arding Hsu, et al. Image processing on compressed data for large video databases, 1993, MULTIMEDIA '93.

[3] Rangasami L. Kashyap, et al. Augmented Transition Network as a Semantic Model for Video Data, 2001.

[4] Adnan M. Alattar. Detecting fade regions in uncompressed video sequences, 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[5] Ullas Gargi, et al. Performance characterization and comparison of video indexing algorithms, 1998, Proceedings. 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No.98CB36231).

[6] Behzad Shahraray, et al. Scene change detection and content-based sampling of video sequences, 1995, Electronic Imaging.

[7] Borko Furht, et al. Video and Image Processing in Multimedia Systems, 1995.

[8] Akio Nagasaka, et al. Automatic Video Indexing and Full-Video Search for Object Appearances, 1991, VDB.

[9] Rangasami L. Kashyap, et al. Augmented transition networks as video browsing models for multimedia databases and multimedia information systems, 1999, Proceedings 11th International Conference on Tools with Artificial Intelligence.

[10] Wei Xiong, et al. Efficient Scene Change Detection and Camera Motion Annotation for Video Classification, 1998, Comput. Vis. Image Underst.

[11] Stephen W. Smoliar, et al. Developing power tools for video indexing and retrieval, 1994, Electronic Imaging.

[12] Didier Le Gall, et al. MPEG: a video compression standard for multimedia applications, 1991, CACM.

[13] Ramin Zabih, et al. Comparing images using joint histograms, 1999, Multimedia Systems.

[14] Chong-Wah Ngo, et al. Motion-Based Video Representation for Scene Change Detection, 2000, Proceedings 15th International Conference on Pattern Recognition. ICPR-2000.

[15] Ramin Zabih, et al. A feature-based algorithm for detecting and classifying scene breaks, 1995, MULTIMEDIA '95.

[16] Boon-Lock Yeo, et al. Rapid scene analysis on compressed video, 1995, IEEE Trans. Circuits Syst. Video Technol.

[17] Atreyi Kankanhalli, et al. Automatic partitioning of full-motion video, 1993, Multimedia Systems.

[18] Rangasami L. Kashyap, et al. Indexing and searching structure for multimedia database systems, 1999, Electronic Imaging.

[19] A. Murat Tekalp, et al. Temporal video segmentation using unsupervised clustering and semantic object tracking, 1998, J. Electronic Imaging.

[20] Michael J. Swain, et al. Interactive indexing into image databases, 1993, Electronic Imaging.

[21] Young-Min Kim, et al. Fast Scene Change Detection using Direct Feature Extraction from MPEG Compressed Videos, 2000, IEEE Trans. Multim.

[22] Rangasami L. Kashyap, et al. Unsupervised video segmentation and object tracking, 2000.

[23] Dong-Seok Jeong, et al. Detection of video scene breaks using directional informations in DCT domain, 1999, Proceedings 10th International Conference on Image Analysis and Processing.

[24] Ramesh C. Jain, et al. Knowledge-guided parsing in video databases, 1993, Electronic Imaging.