Advances in statistical machine learning have led to a range of methods for the dynamic analysis of multimodal streams. However, previous studies are limited in handling diverse real-world streams because they exploit only a narrow set of domain-specific characteristics, such as the repetition of fixed frames. In this paper, we introduce a generative model-based segmentation method that requires no prior knowledge: a story segment of a video stream is estimated from the likelihood that a given model explains the incoming data. A key difficulty in such an approach is how to compare the latent structure parameters of different segments. The proposed model circumvents this difficulty by computing the likelihood of a new frame given a story model. We use the Latent Dirichlet Allocation (LDA) framework to build the model of each story segment, and we apply the method to distinguishing story segments in a TV drama episode. The method is validated by comparing its results with segmentations estimated by humans.
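The likelihood-based boundary test described above can be illustrated with a minimal sketch. It is not the implementation used in this paper: for brevity it replaces the LDA story model with a single smoothed unigram distribution over visual words, and the function names, the smoothing constant `alpha`, and the fixed `threshold` are all assumptions made for illustration. Each frame is assumed to be a list of integer visual-word indices.

```python
import math
from collections import Counter


def frame_log_likelihood(frame, counts, total, vocab_size, alpha=0.1):
    """Average per-word log-likelihood of a frame under the current segment model.

    The segment model here is a smoothed unigram distribution, a deliberately
    simplified stand-in for an LDA story model (assumption for illustration).
    """
    if not frame:
        return 0.0
    denom = total + alpha * vocab_size
    log_lik = sum(math.log((counts[w] + alpha) / denom) for w in frame)
    return log_lik / len(frame)


def segment_stream(frames, vocab_size, threshold, alpha=0.1):
    """Return the frame indices at which a new story segment starts."""
    boundaries = [0]
    counts, total = Counter(), 0
    for i, frame in enumerate(frames):
        if i > 0:
            score = frame_log_likelihood(frame, counts, total, vocab_size, alpha)
            if score < threshold:
                # The current model explains this frame poorly: open a new segment
                # and discard the old segment's statistics.
                boundaries.append(i)
                counts, total = Counter(), 0
        # Fold the frame into the (possibly fresh) segment model.
        counts.update(frame)
        total += len(frame)
    return boundaries


# Toy stream: three frames using words {0, 1}, then three using words {2, 3}.
frames = [[0, 1, 0, 1]] * 3 + [[2, 3, 2, 3]] * 3
print(segment_stream(frames, vocab_size=4, threshold=-2.0))  # prints [0, 3]
```

Because each new frame is scored only against the current segment's model, the sketch never has to compare the parameters of two segment models directly, which is the point made in the abstract. A fixed threshold is the simplest possible decision rule and would likely need tuning or replacement on real data.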