Shot Classification and Scene Segmentation Based on MPEG Compressed Movie Analysis

This paper proposes a method for shot classification and for identifying scene boundaries and genres in MPEG-compressed movies. Through statistical analysis of audio-visual features in the compressed domain, the proposed method classifies the shots of a movie into a predefined genre set in a way that agrees with subjective judgment, and segments scenes based on the shot classification results. Feature vectors that have been subjectively evaluated for each genre are fed into a decision tree classifier, so that each shot is classified at very low computational cost. A sequence of consecutive shots belonging to the same genre is then treated as a scene. The experimental results show that most shots in the movies are classified into subjectively accurate genres, and that the scene segmentation results are more accurate and robust than those of the conventional approach.
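The final step described above, grouping a run of same-genre shots into a scene, can be illustrated with a minimal sketch. It assumes the per-shot genre labels have already been produced by some classifier. The function name `segment_scenes` and the genre labels used in the example are hypothetical and are not taken from the paper, which does not list its genre set here.

```python
from itertools import groupby
from typing import List, Tuple


def segment_scenes(shot_genres: List[str]) -> List[Tuple[str, int, int]]:
    """Merge runs of consecutive shots that share a genre label into scenes.

    Returns one (genre, first_shot_index, last_shot_index) tuple per scene.
    The input is assumed to hold one genre label per shot, in temporal order.
    """
    scenes = []
    start = 0
    for genre, run in groupby(shot_genres):
        length = sum(1 for _ in run)  # number of shots in this run
        scenes.append((genre, start, start + length - 1))
        start += length
    return scenes


# Hypothetical labels, for illustration only.
shots = ["dialogue", "dialogue", "action", "action", "action", "dialogue"]
scenes = segment_scenes(shots)
for genre, first, last in scenes:
    print(f"shots {first}-{last}: {genre}")
```

With this grouping rule, a scene boundary falls wherever the genre label changes between adjacent shots, so a single misclassified shot would split a scene. Any smoothing of isolated labels would be an additional design choice not described in the abstract.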
