Spatio-temporal Descriptor Using 3D Curvature Scale Space

This paper presents a novel technique for jointly representing the shape and motion of video objects for content-based video retrieval (CBVR). It enables the retrieval of similar objects undergoing similar motion patterns, which cannot be captured using motion trajectory or shape descriptors alone. In our approach, shape and motion information are integrated into a unified spatio-temporal representation. The curvature scale space (CSS) theory proposed by Mokhtarian is extended to 3D to represent both the shape and the motion trajectory of video objects. A sequence of 2D contours is taken as input and convolved with a 2D Gaussian. The zero crossings of the curvature of the evolved surfaces are located, and together they form the 3D CSS surface. The peaks of the 3D CSS surface serve as the features for the joint spatio-temporal representation of video objects. Experiments on CBVR show that the algorithm performs well.
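The pipeline described above (stack the contours, smooth with a 2D Gaussian, locate curvature zero crossings at each scale) can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: every contour is closed and resampled to the same number of points with a consistent starting point, the 2D Gaussian acts on the coordinate functions x(u, t) and y(u, t) over the contour parameter u and the frame index t, and the curvature is taken as the planar curvature of each smoothed frame. The paper may use a different surface curvature measure. All function names are hypothetical, and peak extraction from the resulting volume is not shown.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian, truncated at 3 sigma (an assumed cutoff)."""
    radius = int(np.ceil(3 * sigma))
    k = np.arange(-radius, radius + 1)
    g = np.exp(-(k ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()

def smooth_axis(a, sigma, axis, mode):
    """Gaussian-smooth along one axis; 'wrap' padding suits a closed contour."""
    g = gaussian_kernel(sigma)
    r = len(g) // 2
    pad = [(0, 0)] * a.ndim
    pad[axis] = (r, r)
    padded = np.pad(a, pad, mode=mode)
    return np.apply_along_axis(np.convolve, axis, padded, g, mode="valid")

def smooth_surface(x, y, sigma):
    """Separable 2D Gaussian over (t, u) for arrays of shape (frames, points).

    Assumes 3 * sigma is smaller than the number of contour points.
    """
    out = []
    for c in (x, y):
        c = smooth_axis(c, sigma, axis=1, mode="wrap")  # periodic in u
        c = smooth_axis(c, sigma, axis=0, mode="edge")  # clamped in time
        out.append(c)
    return out

def curvature(x, y):
    """Planar curvature of each frame via periodic central differences in u."""
    xu = (np.roll(x, -1, axis=1) - np.roll(x, 1, axis=1)) / 2.0
    yu = (np.roll(y, -1, axis=1) - np.roll(y, 1, axis=1)) / 2.0
    xuu = np.roll(x, -1, axis=1) - 2.0 * x + np.roll(x, 1, axis=1)
    yuu = np.roll(y, -1, axis=1) - 2.0 * y + np.roll(y, 1, axis=1)
    return (xu * yuu - yu * xuu) / (xu ** 2 + yu ** 2) ** 1.5

def css_volume(x, y, sigmas):
    """Boolean array (scales, frames, points): True where curvature changes sign."""
    layers = []
    for sigma in sigmas:
        xs, ys = smooth_surface(x, y, sigma)
        s = np.sign(curvature(xs, ys))
        layers.append(s * np.roll(s, -1, axis=1) < 0)
    return np.stack(layers)
```

Calling `css_volume(x, y, sigmas)` with `x` and `y` of shape (frames, points) yields one zero-crossing map per scale. Concavities in a contour produce crossings at small scales that vanish as the scale grows, while a convex contour produces none at any scale.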
