Semantic video indexing using a probabilistic framework

Proposes a probabilistic framework for semantic video indexing. The components of the framework are multijects and multinets. Multijects are probabilistic multimedia objects representing semantic features or concepts. A multinet is a probabilistic network of multijects which accounts for the interaction between concepts. The main contribution of the paper is the application of a graphical probabilistic framework to build the multinet. The multinet enhances the detection performance of individual multijects, provides a unified framework for integrating multiple modalities and supports inference of unobservable concepts based on their relation with observable concepts. We develop multijects for detecting sites (locations) in video and integrate the multijects using multinet in the form of a Bayesian network. Detection performance is significantly improved using the multinet.

[1]  M. Ibrahim Sezan,et al.  A computational approach to semantic event detection , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[2]  Nuno Vasconcelos,et al.  Bayesian modeling of video editing and structure: semantic features for video summarization and browsing , 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).

[3]  W. Eric L. Grimson,et al.  A framework for learning query concepts in image classification , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[4]  Yücel Altunbasak,et al.  Content-based video retrieval and compression: a unified solution , 1997, Proceedings of International Conference on Image Processing.

[5]  Milind R. Naphade,et al.  Stochastic modeling of soundtrack for efficient segmentation and indexing of video , 1999, Electronic Imaging.

[6]  Shih-Fu Chang,et al.  Spatio-temporal video search using the object based video representation , 1997, Proceedings of International Conference on Image Processing.

[7]  A. Murat Tekalp,et al.  A high-performance shot boundary detection algorithm using multiple cues , 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).

[8]  Rangachar Kasturi,et al.  Machine vision , 1995 .

[9]  Anil K. Jain,et al.  Shape-Based Retrieval: A Case Study With Trademark Image Databases , 1998, Pattern Recognit..

[10]  Shih-Fu Chang,et al.  Semantic visual templates: linking visual features to semantics , 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).

[11]  Brendan J. Frey,et al.  Probabilistic multimedia objects (multijects): a novel approach to video indexing and retrieval in multimedia systems , 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).

[12]  Robert B. McGhee,et al.  Aircraft Identification by Moment Invariants , 1977, IEEE Transactions on Computers.

[13]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[14]  A. Murat Tekalp,et al.  Probabilistic Analysis and Extraction of Video Content , 1999, ICIP.