Probabilistic Analysis and Extraction of Video Content

In this paper we present a probabilistic framework for mapping low-level visual features to a specific set of semantic descriptors. We employ hidden Markov models (HMMs) and Bayesian belief networks (BBNs) at different stages to characterize content domains and extract the relevant semantic information. HMMs are used at the shot and sequence levels to model the sequentially varying structure of video sequences and to segment the video stream into its constituent shots. BBNs, on the other hand, operate on and within each shot to provide more detailed descriptions of shot content from the physical features of video objects. The semantic content extraction problem is thus addressed at both the shot and object levels, within a single consistent representation and processing framework.
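As a rough illustration of how an HMM can delineate a video stream into shots, the sketch below runs Viterbi decoding over a sequence of quantized frame-difference observations. It is not the model used in this paper: the two hidden states (`shot`, `cut`), the two observation symbols (`low`, `high`), and all probability values are hypothetical and chosen only to make the example concrete.

```python
import math

# Hypothetical two-state model: frames inside a shot vs. frames at a cut.
STATES = ("shot", "cut")
START = {"shot": 0.9, "cut": 0.1}
TRANS = {
    "shot": {"shot": 0.9, "cut": 0.1},
    "cut": {"shot": 0.8, "cut": 0.2},
}
# Observations are quantized frame-to-frame differences (assumed feature).
EMIT = {
    "shot": {"low": 0.98, "high": 0.02},
    "cut": {"low": 0.2, "high": 0.8},
}


def viterbi(observations):
    """Return the most likely hidden-state sequence for the observations."""
    if not observations:
        return []
    # Log-probability of the best path ending in each state at frame 0.
    score = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for a path that ends in s at this frame.
            best_prev = max(
                STATES, key=lambda p: score[p] + math.log(TRANS[p][s])
            )
            new_score[s] = (
                score[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def shot_boundaries(path):
    """Frame indices that the decoded path labels as cuts."""
    return [i for i, s in enumerate(path) if s == "cut"]


frames = ["low", "low", "high", "low", "low"]
print(shot_boundaries(viterbi(frames)))  # prints [2]
```

With these assumed parameters, the single high-difference frame is decoded as a cut, which splits the five frames into two shots. A practical system would learn the transition and emission probabilities from labeled video and would typically use richer, continuous-valued features.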
