Extracting moods from pictures and sounds: towards truly personalized TV

This paper considers how we feel about the content we see or hear. As opposed to cognitive content information, which comprises facts about genre, temporal content structures, and spatiotemporal content elements, we are interested in obtaining information about the feelings, emotions, and moods evoked by a speech, audio, or video clip. We refer to the latter as the affective content, and to terms such as happy or exciting as the affective labels of an audiovisual signal. In the first part of the paper, we explore the possibilities for representing and modeling the affective content of an audiovisual signal to effectively bridge the affective gap. Without losing generality, we refer to this signal simply as video, which we see as an image sequence with an accompanying soundtrack. We then show the high potential of affective video content analysis for enhancing the content recommendation functionalities of future PVRs and VOD systems. We conclude the paper by outlining some interesting research challenges in the field.
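The abstract speaks of representing and modeling affective content to bridge the affective gap. A common way to make this concrete in affective video content analysis is to map low-level audiovisual features onto a time curve in an arousal-valence space. The Python sketch below illustrates such a mapping for the arousal dimension only; the chosen features (motion activity, shot-cut density, sound energy), their equal weighting, and the Kaiser-window smoothing parameters are illustrative assumptions, not the specific model prescribed by the paper.

```python
# Illustrative sketch only: feature choice, equal weights, and the
# Kaiser-window length/beta are assumptions made for this example.
import numpy as np

def normalize(x):
    """Scale a per-frame feature stream to the [0, 1] range."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)

def arousal_curve(motion_activity, cut_density, sound_energy,
                  window_len=250, beta=5.0):
    """Fuse normalized low-level features into one smoothed per-frame
    arousal curve; smoothing mimics the inertia of affective response."""
    features = [normalize(f) for f in (motion_activity, cut_density, sound_energy)]
    combined = np.mean(features, axis=0)            # equal weights (assumption)
    win = np.kaiser(min(window_len, len(combined)), beta)
    win /= win.sum()                                # unit-gain smoothing window
    return np.convolve(combined, win, mode="same")

# Usage with synthetic per-frame features for a 10,000-frame clip.
rng = np.random.default_rng(0)
T = 10_000
curve = arousal_curve(rng.random(T), rng.random(T), rng.random(T))
print(curve.shape, float(curve.min()), float(curve.max()))
```

A curve of this kind can then be thresholded or ranked to locate the most exciting segments of a video, which is the kind of functionality a content recommender or highlight extractor in a PVR/VOD setting would build on.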
