An Innovative Model of Tempo and Its Application in Action Scene Detection for Movie Analysis

In this paper, we present an innovative model of tempo and its application to action scene detection for movie analysis. For the first time, we explicitly propose that tempo reflects the rhythm of both the movie scenario and human perception. By analyzing both aspects thoroughly, we classify the factors of tempo into two categories. The first is based on film grammar, where the low-level features of shot length and camera motion describe the director's filmmaking style. The second is based on human perception, for which we originally propose an information measure of perception grounded in cognitive informatics, a newly emerging and significant subject. Drawing on information from both the visual and auditory modalities, the low-level features of motion intensity, motion complexity, audio energy, and audio pace are integrated to formulate this information measure and describe viewers' emotional responses to the continuously developing storyline. Combining both aspects, tempo is defined and a tempo flow plot is derived as a clue to the storyline. On the basis of video structuralization and movie tempo analysis, we build a system for hierarchical browsing and editing with action scene annotation. Large-scale experiments demonstrate the effectiveness and generality of tempo for action movie analysis.
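The abstract does not give the exact formulation of the tempo measure. The following is a minimal illustrative sketch, assuming per-shot feature arrays and equal weights; the function name, the normalization, and the -log "surprise" term standing in for the perception information measure are all assumptions for illustration, not the paper's actual equations.

```python
import numpy as np

def tempo_flow(shot_lengths, camera_motion, motion_intensity,
               motion_complexity, audio_energy, audio_pace,
               smooth_window=5):
    """Illustrative per-shot tempo score (all inputs are per-shot arrays).

    Film-grammar side: shorter shots and stronger camera motion raise tempo.
    Perception side: an information-style term derived from audio-visual activity.
    Weights and the exact information measure are assumptions, not the paper's.
    """
    def norm(x):
        x = np.asarray(x, dtype=float)
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

    # Film grammar: fast cutting (short shots) and camera motion increase tempo.
    grammar = (1.0 - norm(shot_lengths)) + norm(camera_motion)

    # Perception: treat combined audio-visual activity as the "surprise"
    # (-log probability) of each shot, a simple information-style proxy.
    activity = (norm(motion_intensity) + norm(motion_complexity) +
                norm(audio_energy) + norm(audio_pace)) / 4.0
    p = np.clip(1.0 - activity, 1e-6, 1.0)
    perception = -np.log(p)

    tempo = norm(grammar) + norm(perception)

    # Smooth into a tempo flow plot; action scenes show up as sustained peaks.
    kernel = np.ones(smooth_window) / smooth_window
    return np.convolve(tempo, kernel, mode='same')
```

Under these assumptions, thresholding sustained peaks of the smoothed tempo flow would yield candidate action scene segments for annotation.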
