A cinematic-based framework for scene boundary detection in video

Most current video retrieval systems use shots as the basis for information organization and access. In cinematography, however, the scene is the basic story unit that directors use to compose and convey their ideas. This paper proposes a framework based on the concept of continuity to analyze video content and extract scene boundaries. Starting from a set of shots, the framework successively applies visual, position, camera focal distance, motion, audio, and semantic continuity to group shots that exhibit some form of continuity into scenes. The framework also helps to explain the principles and heuristics behind most cinematic rules. The idea is tested by using the first three levels of continuity (visual, position, and camera focal distance) to extract scenes defined by the most common cinematic rules. The method was found to be effective.
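The successive grouping described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify features, thresholds, or merge rules, so the `Shot` fields, the histogram-intersection test, the `location` label, and the 0.7 threshold are all hypothetical stand-ins. The sketch only shows the general shape of the idea, in which each continuity level is applied in turn and adjacent groups of shots are merged when some pair of their shots passes the test for that level.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Shot:
    index: int
    histogram: Sequence[float]  # hypothetical: normalized color histogram
    location: str               # hypothetical: label standing in for position


# A continuity test decides whether two shots are continuous at one level.
Continuity = Callable[[Shot, Shot], bool]


def visual_continuity(a: Shot, b: Shot, threshold: float = 0.7) -> bool:
    # Assumption: histogram intersection as a stand-in visual similarity.
    overlap = sum(min(x, y) for x, y in zip(a.histogram, b.histogram))
    return overlap >= threshold


def position_continuity(a: Shot, b: Shot) -> bool:
    # Assumption: two shots are position-continuous if they share a label.
    return a.location == b.location


def merge_pass(groups: List[List[Shot]], test: Continuity) -> List[List[Shot]]:
    # Merge each group into its predecessor when any pair of shots
    # across the two groups passes the continuity test.
    merged = [groups[0]]
    for group in groups[1:]:
        prev = merged[-1]
        if any(test(p, s) for p in prev for s in group):
            merged[-1] = prev + group
        else:
            merged.append(group)
    return merged


def group_shots(shots: List[Shot], levels: List[Continuity]) -> List[List[Shot]]:
    # Start with one group per shot, then apply each level in order.
    groups = [[s] for s in shots]
    if not groups:
        return []
    for test in levels:
        groups = merge_pass(groups, test)
    return groups


if __name__ == "__main__":
    shots = [
        Shot(0, [0.8, 0.2], "kitchen"),
        Shot(1, [0.7, 0.3], "kitchen"),
        Shot(2, [0.1, 0.9], "kitchen"),
        Shot(3, [0.9, 0.1], "street"),
    ]
    scenes = group_shots(shots, [visual_continuity, position_continuity])
    # Scene boundaries fall between consecutive groups.
    print([[s.index for s in scene] for scene in scenes])
```

In this toy input, shots 0 and 1 merge at the visual level, shot 2 joins them only at the position level, and shot 3 stays separate, so one boundary is reported between shots 2 and 3. Further levels (camera focal distance, motion, audio, semantic) would be added as extra entries in the `levels` list under the same assumptions.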

[1]  Christian Plaunt,et al.  Subtopic structuring for full-length document access , 1993, SIGIR.

[2]  Tat-Seng Chua,et al.  A video retrieval and sequencing system , 1995, TOIS.

[3]  Tat-Seng Chua,et al.  CINEMATIC-BASED MODEL FOR SCENE BOUNDARY DETECTION , 2001 .

[4]  Shih-Fu Chang,et al.  Clustering methods for video browsing and annotation , 1996, Electronic Imaging.

[5]  John P. Oakley,et al.  Storage and Retrieval for Image and Video Databases , 1993 .

[6]  Masahito Hirakawa,et al.  Content-based retrieval of video data by the grammar of film , 1997, Proceedings. 1997 IEEE Symposium on Visual Languages (Cat. No.97TB100180).

[7]  Suh-Yin Lee,et al.  Content-based video retrieval based on similarity of frame sequence , 1998, Proceedings International Workshop on Multi-Media Database Management Systems (Cat. No.98TB100249).

[8]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[9]  Glorianna Davenport,et al.  Cinematic primitives for multimedia , 1991, IEEE Computer Graphics and Applications.

[10]  Osamu Hori,et al.  A shot classification method of selecting effective key-frames for video browsing , 1997, MULTIMEDIA '96.

[11]  Thomas S. Huang,et al.  Exploring video structure beyond the shots , 1998, Proceedings. IEEE International Conference on Multimedia Computing and Systems (Cat. No.98TB100241).

[12]  Wei Xiong,et al.  Query by video clip , 1998, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170).

[13]  Roy Thompson,et al.  Grammar of the Shot , 1998 .

[14]  S. Eisenstein,et al.  The Film Sense , 1942 .

[15]  Minerva M. Yeung,et al.  Efficient matching and clustering of video shots , 1995, Proceedings., International Conference on Image Processing.

[16]  Shih-Fu Chang,et al.  Determining computable scenes in films and their structures using audio-visual memory models , 2000, ACM Multimedia.

[17]  Boon-Lock Yeo,et al.  Extracting story units from long programs for video browsing and navigation , 1996, Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems.

[18]  Frank Eugene Beaver,et al.  Dictionary of film terms , 1983 .

[19]  Tsukasa Noma,et al.  Automating virtual camera control for computer animation , 1992 .

[20]  Mohan S. Kankanhalli,et al.  A GENERAL FRAMEWORK FOR VIDEO SEGMENTATION BASED ON TEMPORAL MULTI-RESOLUTION ANALYSIS , 2000 .

[21]  Remi Depommier,et al.  Content-based browsing of video sequences , 1994, MULTIMEDIA '94.

[22]  John R. Kender,et al.  Video scene segmentation via continuous video coherence , 1998, Proceedings. 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No.98CB36231).

[23]  Alan Hanjalic,et al.  Automated high-level movie segmentation for advanced video-retrieval systems , 1999, IEEE Trans. Circuits Syst. Video Technol..

[24]  Tat-Seng Chua,et al.  Detection of human faces in a compressed domain for video stratification , 2002, The Visual Computer.

[25]  D. Arijon,et al.  Grammar of Film Language , 1976 .

[26]  Tat-Seng Chua,et al.  A match and tiling approach to content-based video retrieval , 2001, IEEE International Conference on Multimedia and Expo, 2001. ICME 2001..