MST-CSS (Multi-Spectro-Temporal Curvature Scale Space), a Novel Spatio-Temporal Representation for Content-Based Video Retrieval

We present a novel spatio-temporal descriptor that efficiently represents a video object for content-based video retrieval. Spatial and temporal features are integrated in a unified framework for retrieving similar video shots. A sequence of orthogonal processing steps, using a pair of 1-D multiscale and multispectral filters, is applied to the space-time volume (STV) of a video object (VOB) to produce a gradually evolving (smoother) surface. Zero-crossing contours (2-D), computed from the mean curvature of this evolving surface, are stacked in layers to yield a hilly (3-D) surface that forms a joint multispectro-temporal curvature scale space (MST-CSS) representation of the video object. Peaks and valleys (saddle points) are detected on the MST-CSS surface for feature representation and matching. The cost function for matching a query video shot with a model involves matching a pair of 3-D point sets, together with their attributes (local curvature) and the 3-D orientations of the finally smoothed STV surfaces. Experiments were performed on simulated and real-world video shots, with the precision-recall metric used to evaluate performance. The system is compared with a few state-of-the-art methods that use shape and motion trajectory for VOB representation. Our unified approach performed better than approaches that combine match costs obtained from separate shape and motion-trajectory representations, and better than our previous work on a simple joint spatio-temporal descriptor (3-D-CSS).
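The smoothing and zero-crossing steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the surface is given as a simple height field `z[t, s]` on a unit-spaced grid, and it substitutes plain separable 1-D Gaussian filters for the paper's multiscale, multispectral filter pair. All function names (`smooth_separable`, `mean_curvature`, `zero_crossings`, `css_stack`) are hypothetical and introduced only for illustration.

```python
import numpy as np


def gaussian_kernel_1d(sigma):
    """Normalised 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(np.ceil(3.0 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()


def smooth_separable(z, sigma):
    """Smooth a 2-D height field with a 1-D Gaussian along each axis in turn.

    Assumption: a Gaussian stands in for the paper's filters.
    Edge padding keeps the output the same shape as the input.
    """
    k = gaussian_kernel_1d(sigma)
    pad = len(k) // 2

    def conv_axis(a, axis):
        widths = [(pad, pad) if i == axis else (0, 0) for i in range(a.ndim)]
        padded = np.pad(a, widths, mode="edge")
        return np.apply_along_axis(
            lambda v: np.convolve(v, k, mode="valid"), axis, padded
        )

    return conv_axis(conv_axis(z, 0), 1)


def mean_curvature(z):
    """Mean curvature H of the height field z[t, s] (unit grid spacing)."""
    zt, zs = np.gradient(z)      # first derivatives along axis 0 and axis 1
    ztt, zts = np.gradient(zt)   # second derivatives
    _, zss = np.gradient(zs)
    num = (1 + zs**2) * ztt - 2 * zt * zs * zts + (1 + zt**2) * zss
    return num / (2 * (1 + zt**2 + zs**2) ** 1.5)


def zero_crossings(h):
    """Mark cells where h changes sign relative to the next cell on either axis."""
    s = np.sign(h)
    zc = np.zeros(h.shape, dtype=bool)
    zc[:-1, :] |= s[:-1, :] * s[1:, :] < 0
    zc[:, :-1] |= s[:, :-1] * s[:, 1:] < 0
    return zc


def css_stack(z, sigmas):
    """Stack zero-crossing maps of H, one layer per smoothing scale."""
    return np.stack(
        [zero_crossings(mean_curvature(smooth_separable(z, s))) for s in sigmas]
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    i = np.arange(40, dtype=float)
    # Synthetic ridge pattern plus noise, standing in for a real STV surface.
    surface = 5.0 * np.sin(2 * np.pi * (i[:, None] + 0.5) / 20.0) + 0.0 * i[None, :]
    surface = surface + 0.2 * rng.standard_normal(surface.shape)
    layers = css_stack(surface, sigmas=[1.0, 2.0, 4.0])
    print(layers.shape, layers.sum(axis=(1, 2)))
```

Each layer of the returned array is a binary map of mean-curvature sign changes at one scale. Detecting peaks and saddle points on the stacked surface, and the point-set matching cost, are outside the scope of this sketch.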
