Movie Content Retrieval and Semi-automatic Annotation Based on Low-Level Descriptions

In this paper, we present a semantic retrieval and semi-automatic annotation system for movies, based on the regional features of video images. The system uses a 5-dimensional GBD-tree structure to organize the low-level features of each region segmented from a key frame: its color, its area, and the coordinates of its minimum bounding rectangle. We propose a region-based "semantic" object retrieval method that compares the color, area, and spatial relationships of selected regions to distinguish them from background information. With this method, movies whose video data contain the same objects can be retrieved on the basis of object semantics. In addition, we propose a semi-automatic annotation method that annotates the matched "semantic" objects for later use. We have implemented a retrieval system that provides both the semantic retrieval and the semi-automatic annotation functions.
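The region comparison described above can be illustrated with a minimal sketch. It is not the system's implementation: the abstract does not say how the features map onto the five index dimensions, so the sketch stores color, area, and bounding rectangle directly and uses a plain linear scan in place of the GBD-tree. The names `Region`, `similar`, `direction`, and `match_object`, the RGB color representation, and the tolerance values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
from math import dist

# Largest possible distance between two RGB colors (assumed 0..255 per channel).
MAX_COLOR_DIST = dist((0, 0, 0), (255, 255, 255))


@dataclass(frozen=True)
class Region:
    """One segmented region of a key frame (hypothetical representation)."""
    color: tuple   # mean color, assumed (R, G, B)
    area: float    # fraction of the frame covered, assumed > 0
    mbr: tuple     # bounding rectangle (x_min, y_min, x_max, y_max)

    def center(self):
        x0, y0, x1, y1 = self.mbr
        return ((x0 + x1) / 2, (y0 + y1) / 2)


def similar(a, b, color_tol=0.15, area_ratio=0.5):
    """True if two regions are close in color and comparable in area.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    color_ok = dist(a.color, b.color) / MAX_COLOR_DIST <= color_tol
    area_ok = min(a.area, b.area) / max(a.area, b.area) >= area_ratio
    return color_ok and area_ok


def direction(a, b):
    """Coarse spatial relationship: sign of the offset from a's center to b's."""
    (ax, ay), (bx, by) = a.center(), b.center()
    sign = lambda v: (v > 0) - (v < 0)
    return (sign(bx - ax), sign(by - ay))


def match_object(query, frame):
    """Greedily match each query region to a frame region, then check that
    every pair of matched regions keeps the query's spatial relationship.
    Returns the matched regions in query order, or None if there is no match.
    """
    matched = []
    for q in query:
        candidates = [r for r in frame if r not in matched and similar(q, r)]
        if not candidates:
            return None
        matched.append(min(candidates, key=lambda r: dist(q.color, r.color)))
    for i, j in combinations(range(len(query)), 2):
        if direction(query[i], query[j]) != direction(matched[i], matched[j]):
            return None
    return matched


# Query object: a red region directly above a blue region (y grows downward).
query = [
    Region((200, 30, 30), 0.05, (0.4, 0.2, 0.6, 0.4)),
    Region((30, 30, 200), 0.06, (0.4, 0.4, 0.6, 0.7)),
]
# Key frame: a large green background plus the same object at another position.
frame = [
    Region((40, 160, 40), 0.60, (0.0, 0.0, 1.0, 1.0)),
    Region((210, 40, 35), 0.04, (0.1, 0.3, 0.3, 0.5)),
    Region((25, 35, 190), 0.05, (0.1, 0.5, 0.3, 0.8)),
]
result = match_object(query, frame)
```

In this example the green background region is rejected on color alone, while the red and blue regions are accepted because they also preserve the "red above blue" arrangement of the query. Once such a match is found, an annotation attached to the query object could be carried over to the matched regions, which is the kind of step a semi-automatic annotation function might perform.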