Most current work on video indexing concentrates on queries that operate over high-level semantic information, which must be composed and entered entirely by hand. We propose an indexing system based on spatial information about key objects in a scene. These key objects may be detected automatically, with manual supervision, and tracked through a sequence using one of a number of recently developed techniques. This representation is highly compact and allows rapid resolution of queries specified by iconic example. A number of systems have been produced which use 2D string notations to index digital image libraries. Just as 2D strings provide a compact and tractable indexing notation for digital pictures, a sequence of 2D strings might provide an index for a video or image sequence. To improve on this, we reduce the representation to the 2D string pair representing the initial frame, followed by a sequence of edits to these strings. This takes advantage of the continuity between frames to further reduce the size of the notation. The resulting string-edit notation is compact and allows queries on the spatial relationships of objects to be answered without rebuilding the majority of the scene. Calculating the ranks of objects directly from the edit sequence allows matching with minimal calculation, greatly reducing search time. This paper presents the edit sequence notation and algorithms for evaluating queries over image sequences. A number of optimizations that yield a considerable saving in search time are also demonstrated.
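The idea of storing an initial 2D string pair plus per-frame edits, and answering a spatial query by following object ranks through those edits, can be illustrated with a minimal Python sketch. It is not the paper's notation: it assumes a fixed set of tracked objects represented by centroids, a strict ordering along each axis (no "=" operator), and edits restricted to adjacent swaps. All names (`to_2d_string`, `swaps_between`, `encode`, `frames_where_left_of`) are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Frame = Dict[str, Point]                  # object symbol -> (x, y) centroid
StringPair = Tuple[List[str], List[str]]  # (order along x, order along y)
Swap = Tuple[str, int]                    # (axis "u" or "v", i): swap ranks i and i+1


def to_2d_string(frame: Frame) -> StringPair:
    """Simplified 2D string pair: symbols sorted along x (u) and along y (v)."""
    u = sorted(frame, key=lambda s: (frame[s][0], s))
    v = sorted(frame, key=lambda s: (frame[s][1], s))
    return u, v


def swaps_between(old: List[str], new: List[str]) -> List[int]:
    """Adjacent swaps (bubble sort) that turn ordering `old` into `new`."""
    target = {s: i for i, s in enumerate(new)}
    seq = list(old)
    swaps: List[int] = []
    for end in range(len(seq) - 1, 0, -1):
        for i in range(end):
            if target[seq[i]] > target[seq[i + 1]]:
                seq[i], seq[i + 1] = seq[i + 1], seq[i]
                swaps.append(i)
    return swaps


def encode(frames: List[Frame]) -> Tuple[StringPair, List[List[Swap]]]:
    """Keep the first frame's string pair and one list of edits per later frame."""
    first = to_2d_string(frames[0])
    prev, edits = first, []
    for frame in frames[1:]:
        cur = to_2d_string(frame)
        step = [("u", i) for i in swaps_between(prev[0], cur[0])]
        step += [("v", i) for i in swaps_between(prev[1], cur[1])]
        edits.append(step)
        prev = cur
    return first, edits


def _moved(rank: int, i: int) -> int:
    """New rank of an object after ranks i and i+1 are swapped."""
    if rank == i:
        return i + 1
    if rank == i + 1:
        return i
    return rank


def frames_where_left_of(first: StringPair, edits: List[List[Swap]],
                         a: str, b: str) -> List[int]:
    """Frame indices where `a` precedes `b` along x.

    Only the two ranks of interest are updated per edit; the full
    strings for later frames are never rebuilt.
    """
    ra, rb = first[0].index(a), first[0].index(b)
    hits = [0] if ra < rb else []
    for t, step in enumerate(edits, start=1):
        for axis, i in step:
            if axis == "u":
                ra, rb = _moved(ra, i), _moved(rb, i)
        if ra < rb:
            hits.append(t)
    return hits
```

A query such as `frames_where_left_of(first, edits, "A", "C")` then touches only the stored edits, which is the property the abstract attributes to rank calculation. Handling objects that enter or leave the scene, or objects that coincide along an axis, would need additional edit types not sketched here.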