Discovering recurrent events in video using unsupervised methods

Production videos such as news, sports and movies have a definite structure that involves short-term interaction as well as long-term correlation. This structure can be captured by models that take into account both short-term statistics and long-term recurrence. We investigate the application of probabilistic models that capture this structure. The novel approach is to characterize short-term events in video by models that account for temporal support in terms of piecewise stationary signals with transitions. These short-term events can then be embedded within another temporal model that accounts for transitions between these events and thus characterizes long-term history. This also leads to the detection of recurring events in video using a single monolithic model. The proposed approach is an unsupervised algorithm for event detection, and it can be used for summarization, similarity-based matching and enhanced browsing.
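The two-level idea can be illustrated with a minimal sketch. This is not the authors' model. It assumes each event is a left-to-right chain of piecewise stationary segments with 1-D Gaussian emissions, and that an event-level transition table links the events. The event names ("anchor", "report"), the feature values, the `stay` probability and all function names are hypothetical. The parameters are set by hand here; in an unsupervised setting they would be estimated from data, for example with EM. The sketch flattens both levels into one state space, decodes it with Viterbi, and reads recurring events off the decoded path.

```python
import math

NEG_INF = float("-inf")

def log(p):
    return math.log(p) if p > 0 else NEG_INF

def gaussian_logpdf(x, mean, std):
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)

def build_flat_model(events, event_trans, stay=0.8):
    """Flatten the two levels into a single (monolithic) transition matrix.

    events: name -> list of (mean, std), one pair per stationary segment.
    event_trans: name -> {next_name: prob}, applied when leaving the last segment.
    """
    states = [(name, i) for name, segs in events.items() for i in range(len(segs))]
    index = {s: k for k, s in enumerate(states)}
    trans = [[0.0] * len(states) for _ in states]
    for (name, i), k in index.items():
        trans[k][k] = stay  # remain in the current stationary segment
        if i + 1 < len(events[name]):
            trans[k][index[(name, i + 1)]] = 1.0 - stay  # next segment of the same event
        else:
            for nxt, p in event_trans[name].items():  # leave the event
                trans[k][index[(nxt, 0)]] += (1.0 - stay) * p
    return states, trans

def viterbi(obs, states, trans, events):
    """Most likely state path; every path must start in the first segment of an event."""
    n = len(states)

    def emit(k, x):
        name, seg = states[k]
        mean, std = events[name][seg]
        return gaussian_logpdf(x, mean, std)

    starts = [k for k, (_, seg) in enumerate(states) if seg == 0]
    score = [log(1.0 / len(starts)) + emit(k, obs[0]) if k in starts else NEG_INF
             for k in range(n)]
    back = []
    for x in obs[1:]:
        prev, score, ptr = score, [], []
        for j in range(n):
            best = max(range(n), key=lambda i: prev[i] + log(trans[i][j]))
            ptr.append(best)
            score.append(prev[best] + log(trans[best][j]) + emit(j, x))
        back.append(ptr)
    k = max(range(n), key=lambda i: score[i])
    path = [k]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(k)
    path.reverse()
    return [states[k] for k in path]

def event_occurrences(path):
    """Group the decoded path into (event, first_frame, last_frame) occurrences."""
    occ = []
    for t, (name, seg) in enumerate(path):
        if occ and occ[-1][0] == name and seg >= path[t - 1][1]:
            occ[-1][2] = t
        else:
            occ.append([name, t, t])
    return [tuple(o) for o in occ]

# Hypothetical two-event model and a toy 1-D feature sequence.
events = {
    "anchor": [(0.0, 0.5), (2.0, 0.5)],
    "report": [(6.0, 0.5), (9.0, 0.5)],
}
event_trans = {
    "anchor": {"anchor": 0.1, "report": 0.9},
    "report": {"anchor": 0.9, "report": 0.1},
}
obs = [0.1, -0.2, 2.1, 1.9, 6.2, 5.8, 9.1, 8.9, 0.0, 0.2, 2.0, 2.2, 6.1, 9.0]

states, trans = build_flat_model(events, event_trans)
path = viterbi(obs, states, trans, events)
occurrences = event_occurrences(path)
for name, start, end in occurrences:
    print(f"{name}: frames {start}-{end}")
```

Flattening keeps decoding to a single Viterbi pass, at the cost of a state space that grows with the total number of segments across all events. An event that appears more than once in `occurrences` is a recurring event in the sense used above.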
