Where do emotions come from? Predicting the Emotion Stimuli Map

Which parts of an image evoke emotions in an observer? To answer this question, we introduce a novel problem in computer vision: predicting an Emotion Stimuli Map (ESM), which describes the pixel-wise contribution to evoked emotions. We build a new image database, EmotionROI, as a benchmark for predicting the ESM, and find that the regions selected by saliency and objectness detection do not correctly predict the image regions that evoke emotion. Although objects are important regions for evoking emotion, parts of the background are also important. Based on this finding, we propose using fully convolutional networks to predict the ESM. Both qualitative and quantitative experimental results confirm that our method predicts the emotion-evoking regions better than both saliency and objectness detection.
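To make the notion of a pixel-wise map concrete, the following minimal sketch represents an ESM as a grid of per-pixel scores in [0, 1] and compares a predicted map against a reference map. The list-of-rows representation, the helper names `normalize` and `mean_absolute_error`, and the choice of mean absolute error as the comparison are all illustrative assumptions; the abstract does not specify the data format or the evaluation metric.

```python
from typing import List

# Hypothetical representation: a map is a list of rows of per-pixel scores.
Map = List[List[float]]


def normalize(raw: Map) -> Map:
    """Rescale a non-negative score map so that its maximum value is 1.0."""
    peak = max(max(row) for row in raw)
    if peak == 0:
        # An all-zero map stays all-zero.
        return [[0.0 for _ in row] for row in raw]
    return [[value / peak for value in row] for row in raw]


def mean_absolute_error(pred: Map, truth: Map) -> float:
    """Average per-pixel absolute difference between two same-sized maps.

    This metric is an assumption chosen for illustration only.
    """
    total = 0.0
    count = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += abs(p - t)
            count += 1
    return total / count


# Toy 2x3 reference map: the right-hand pixels contribute most to the emotion.
truth = normalize([[0, 2, 4], [0, 4, 4]])

# Two toy predictions: one close to the reference, one that misses the region.
close_pred = [[0.0, 0.5, 1.0], [0.0, 1.0, 0.5]]
far_pred = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]

print(mean_absolute_error(close_pred, truth))
print(mean_absolute_error(far_pred, truth))
```

Under this assumed metric, a lower score means the predicted map agrees more closely with the reference, so the first prediction scores better than the second.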
