Content-based video retrieval by integrating spatio-temporal and stochastic recognition of events

As the amount of publicly available video data grows, the need to query this data efficiently becomes significant. Consequently, content-based retrieval of video data is both a challenging and an important problem. We address the specific aspect of inferring semantics automatically from raw video data. In particular, we introduce a new video data model that supports the integrated use of two different approaches for mapping low-level features to high-level concepts. The model is first extended with a rule-based approach that supports spatio-temporal formalization of high-level concepts, and then with a stochastic approach. Furthermore, results on real tennis video data are presented, demonstrating the validity of both approaches, as well as the advantages of their integrated use.
