A probabilistic framework for semantic indexing and retrieval in video

This paper proposes a novel probabilistic framework for semantic indexing and retrieval in digital video. The components of the framework are multijects and multinets. Multijects are probabilistic multimedia objects (Naphade et al., 1998) that represent semantic features or concepts. A multinet is a probabilistic network of multijects that accounts for the interaction between concepts. The main contribution of this paper is a Bayesian multinet that enhances the detection probability of individual multijects, provides a unified framework for integrating multiple modalities, and supports inference of unobservable concepts from their relation to observable concepts. We develop multijects for detecting sites (locations) in video and integrate them using a multinet in the form of a Bayesian network. Experiments show a significant performance improvement when the multinet is used.
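To illustrate how a network over concepts can sharpen an individual detection, the following is a minimal sketch and not the paper's implementation. It assumes two binary concepts with a known joint prior and computes posterior marginals by brute-force enumeration. The concept names (`sky`, `water`), the prior table, the likelihood values, and the function `multinet_posterior` are all hypothetical and chosen only for illustration.

```python
from itertools import product


def multinet_posterior(joint_prior, likelihoods):
    """Posterior marginals P(c_i = 1 | all evidence) by exhaustive enumeration.

    joint_prior: dict mapping a tuple of 0/1 concept states to its prior probability.
    likelihoods: one pair (p(x_i | c_i = 0), p(x_i | c_i = 1)) per concept,
                 i.e. the evidence each individual detector contributes.
    """
    n = len(likelihoods)
    weights = {}
    for states in product((0, 1), repeat=n):
        w = joint_prior[states]
        for i, s in enumerate(states):
            w *= likelihoods[i][s]  # multiply in each detector's evidence
        weights[states] = w
    z = sum(weights.values())  # normalizing constant
    return [
        sum(w for s, w in weights.items() if s[i] == 1) / z
        for i in range(n)
    ]


# Hypothetical joint prior over (sky, water): the two concepts tend to co-occur.
joint_prior = {(0, 0): 0.5, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.3}

# Hypothetical evidence: the sky detector is uninformative (as if unobserved),
# while the water detector strongly favors "present".
likelihoods = [(0.5, 0.5), (0.1, 0.9)]

posterior = multinet_posterior(joint_prior, likelihoods)
prior_sky = joint_prior[(1, 0)] + joint_prior[(1, 1)]
print(f"P(sky) prior = {prior_sky:.3f}, with multinet = {posterior[0]:.3f}")
```

In this toy setting the sky detector alone leaves the belief at the prior of 0.4, while conditioning on the correlated water evidence raises it to about 0.667. Exhaustive enumeration is only practical for a handful of concepts; a larger network would need a proper inference algorithm.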

[1] Robert B. McGhee, et al. Aircraft Identification by Moment Invariants, 1977, IEEE Transactions on Computers.

[2] Lawrence R. Rabiner, et al. A tutorial on hidden Markov models and selected applications in speech recognition, 1989, Proc. IEEE.

[3] Rangachar Kasturi, et al. Machine vision, 1995.

[4] Yücel Altunbasak, et al. Content-based video retrieval and compression: a unified solution, 1997, Proceedings of International Conference on Image Processing.

[5] Shih-Fu Chang, et al. Spatio-temporal video search using the object based video representation, 1997, Proceedings of International Conference on Image Processing.

[6] Brendan J. Frey, et al. Probabilistic multimedia objects (multijects): a novel approach to video indexing and retrieval in multimedia systems, 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).

[7] Nuno Vasconcelos, et al. Bayesian modeling of video editing and structure: semantic features for video summarization and browsing, 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).

[8] A. Murat Tekalp, et al. A high-performance shot boundary detection algorithm using multiple cues, 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).

[9] Shih-Fu Chang, et al. Semantic visual templates: linking visual features to semantics, 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).

[10] Anil K. Jain, et al. Shape-Based Retrieval: A Case Study With Trademark Image Databases, 1998, Pattern Recognit.

[11] W. Eric L. Grimson, et al. A framework for learning query concepts in image classification, 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[12] M. Ibrahim Sezan, et al. A computational approach to semantic event detection, 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[13] Milind R. Naphade, et al. Stochastic modeling of soundtrack for efficient segmentation and indexing of video, 1999, Electronic Imaging.

[14] A. Murat Tekalp, et al. Probabilistic Analysis and Extraction of Video Content, 1999, ICIP.