Coherent segmentation of video into syntactic regions

In this paper we report on our work in realising an approach to video shot matching that automatically segments video into abstract, intertwined shapes in a temporally coherent way. These shapes, which approximate objects and background regions, can then be matched to give fine-grained shot-to-shot matching. The main contributions of the paper are, firstly, the extension of our segmentation algorithm for still images to spatial segmentation in video and, secondly, the introduction of a measurement of the temporal coherency of the spatial segmentation. The latter allows us to demonstrate quantitatively the effectiveness of our approach on real video data.
