Affective content analysis by mid-level representation in multiple modalities

Affective content detection in movies is attracting ever-increasing research effort. However, affective content analysis remains a challenging task because of the gap between low-level perceptual features and high-level human perception of the media. Moreover, cues from multiple modalities should be considered in affective analysis, since movies use them to convey emotions and heighten the emotional atmosphere. In this paper, mid-level representations are generated from low-level features. These mid-level representations are drawn from multiple modalities and used for affective content inference. Besides video shots, which are commonly used for video content analysis, audio sounds, dialogue and subtitles are explored for their contribution to detecting affective content. Since affective analysis relies on movie genre, experiments are conducted separately for each genre. The results show that audio sounds, dialogue and subtitles are effective and efficient for affective content detection.
