Movie genre classification via scene categorization

This paper presents a method for classifying movie trailers by genre, based on scene categorization. We view our approach as a step forward from using only low-level visual feature cues, towards the eventual goal of high-level semantic understanding of feature films. Our approach decomposes each trailer into a collection of keyframes through shot boundary analysis. From these keyframes, we use state-of-the-art scene detectors and descriptors to extract features, which are then used for shot categorization via unsupervised learning. This allows us to represent trailers using a bag-of-visual-words (bovw) model with shot classes as the vocabulary. We approach the genre classification task by mapping these temporally structured bovw trailer features to four high-level movie genres: action, comedy, drama, and horror. We have conducted experiments on 1239 annotated trailers. Our experimental results demonstrate that exploiting scene structures improves film genre classification compared to using only low-level visual features.
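The shot-categorization and bovw steps described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: plain k-means stands in for the unspecified unsupervised learner, short toy vectors stand in for the scene descriptors, and the function names (`kmeans`, `nearest`, `bovw_histogram`) are hypothetical. Shot boundary detection, the temporal structure of the features, and the final genre classifier are not shown.

```python
import random
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(p, centroids):
    """Index of the centroid (shot class) closest to feature vector p."""
    return min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for the paper's
    unsupervised learner). Returns k centroids, one per shot class."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its cluster;
        # keep the old centroid if a cluster ends up empty.
        centroids = [
            [sum(dim) / len(c) for dim in zip(*c)] if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


def bovw_histogram(keyframe_features, centroids):
    """Normalized histogram of shot classes for one trailer."""
    counts = Counter(nearest(f, centroids) for f in keyframe_features)
    total = len(keyframe_features)
    return [counts.get(i, 0) / total for i in range(len(centroids))]


if __name__ == "__main__":
    # Toy 2-D "descriptors" pooled from keyframes of many trailers.
    pooled = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
              (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    vocabulary = kmeans(pooled, k=2)

    # One trailer's keyframes, represented as a shot-class histogram.
    trailer = [(0.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    print(bovw_histogram(trailer, vocabulary))
```

In a pipeline of this kind, each trailer's histogram would then be passed to a supervised classifier trained on the annotated genre labels. Which classifier the authors use is not stated in the abstract.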
