Dialogue Sequence Detection in Movies

Dialogue sequences constitute an important part of any movie or television program, and their successful detection is an essential step in any movie summarisation or indexing system. The focus of this paper is the detection of dialogue sequences rather than complete scenes. We argue that these shorter sequences are more desirable as retrieval units than temporally long scenes. To detect such sequences, this paper combines various audiovisual features that reflect accepted and well-known film-making conventions, using a selection of machine learning techniques. Three systems for detecting dialogue sequences are proposed: one based primarily on audio analysis, one based primarily on visual analysis, and one that combines the results of both. The performance of the three systems is compared using a manually marked-up test corpus drawn from a variety of movies of different genres. Results show that high precision and recall can be obtained using low-level features that are automatically extracted.
