Visual Affect Around the World: A Large-scale Multilingual Visual Sentiment Ontology

Every culture and language is unique. Our work focuses expressly on the uniqueness of culture and language in relation to human affect, specifically sentiment and emotion semantics, and on how these manifest in social multimedia. We develop sets of sentiment- and emotion-polarized visual concepts by adapting semantic structures called adjective-noun pairs, originally introduced by Borth et al. (2013), to a multilingual context. We propose a new language-dependent method for the automatic discovery of these adjective-noun constructs, and show how this pipeline can be applied to a social multimedia platform to create a large-scale Multilingual Visual Sentiment Ontology (MVSO). Unlike the flat structure of Borth et al. (2013), our unified ontology is organized hierarchically into multilingual clusters of visually detectable nouns and subclusters of emotionally biased versions of these nouns. In addition, we present an image-based prediction task that measures how well language-specific models generalize in a multilingual context. We also release a new, publicly available dataset of more than 15.6K sentiment-biased visual concepts across 12 languages, together with language-specific detector banks and more than 7.36M images with their metadata.
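The core of the discovery step described above is finding adjacent adjective-noun bigrams in user-supplied text such as image tags. The sketch below illustrates this idea only: `POS_LEXICON`, `extract_anp_candidates`, and the toy tag sequences are all hypothetical names invented here, and the hardcoded lexicon is a stand-in for the language-specific part-of-speech taggers the paper's pipeline would rely on (e.g. tools like those cited in [10], [21], and [26]).

```python
# Illustrative sketch of adjective-noun pair (ANP) candidate discovery.
# POS_LEXICON is a tiny hypothetical stand-in for a real language-specific
# POS tagger, so the example runs without external models or downloads.
from collections import Counter

POS_LEXICON = {
    "beautiful": "ADJ", "old": "ADJ", "abandoned": "ADJ",
    "sky": "NOUN", "house": "NOUN", "beach": "NOUN", "the": "DET",
}

def extract_anp_candidates(tag_sequences):
    """Count adjacent (adjective, noun) bigrams across tag sequences."""
    counts = Counter()
    for tags in tag_sequences:
        pos = [POS_LEXICON.get(t.lower(), "OTHER") for t in tags]
        for i in range(len(tags) - 1):
            if pos[i] == "ADJ" and pos[i + 1] == "NOUN":
                counts[(tags[i].lower(), tags[i + 1].lower())] += 1
    return counts

docs = [
    ["beautiful", "sky", "beach"],
    ["old", "abandoned", "house"],
    ["abandoned", "house", "beautiful", "sky"],
]
print(extract_anp_candidates(docs).most_common(2))
```

In a full pipeline, the resulting candidate counts would then be filtered for frequency, sentiment polarity, and visual detectability before entering the ontology; this sketch covers only the bigram-extraction step.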

[1] Shih-Fu Chang, et al. Predicting Viewer Perceived Emotions in Animated GIFs, 2014, ACM Multimedia.

[2] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.

[3] J. Sabini. Culture and emotion, 1997.

[4] Jiebo Luo, et al. The wisdom of social multimedia: using flickr for prediction and forecast, 2010, ACM Multimedia.

[5] Kemal Oflazer, et al. Parsing Turkish using the lexical functional grammar formalism, 2004, Machine Translation.

[6] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[7] R. Zajonc. Feeling and thinking: Preferences need no inferences, 1980.

[8] Trevor Darrell, et al. Caffe: Convolutional Architecture for Fast Feature Embedding, 2014, ACM Multimedia.

[9] Steven Skiena, et al. International Sentiment Analysis for News and Blogs, 2008, ICWSM.

[10] András Kornai, et al. HunPos: an open source trigram tagger, 2007, ACL.

[11] Jianxiong Xiao, et al. What makes an image memorable?, 2011, CVPR.

[12] Mike Thelwall, et al. Sentiment strength detection in short informal text, 2010.

[13] Geoffrey E. Hinton, et al. Visualizing Data using t-SNE, 2008.

[14] P. Lang. International Affective Picture System (IAPS): Technical Manual and Affective Ratings, 1995.

[15] J. Stephen Downie, et al. Challenges in Cross-Cultural/Multilingual Music Information Seeking, 2005, ISMIR.

[16] Tao Chen, et al. DeepSentiBank: Visual Sentiment Concept Classification with Deep Convolutional Neural Networks, 2014, arXiv.

[17] James P. Bagrow, et al. Human language reveals a universal positivity bias, 2014, Proceedings of the National Academy of Sciences.

[18] Rossano Schifanella, et al. 6 Seconds of Sound and Vision: Creativity in Micro-videos, 2014, CVPR.

[19] Xiangyang Xue, et al. Predicting Emotions in User-Generated Videos, 2014, AAAI.

[20] Rada Mihalcea, et al. Multilingual Subjectivity Analysis Using Machine Translation, 2008, EMNLP.

[21] Dan Klein, et al. Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network, 2003, NAACL.

[22] Edward A. Vessel, et al. Personalized visual aesthetics, 2014, Electronic Imaging.

[23] Mohammad Soleymani, et al. A Multimodal Database for Affect Recognition and Implicit Tagging, 2012, IEEE Transactions on Affective Computing.

[24] Nicu Sebe, et al. Emotional valence categorization using holistic image features, 2008, ICIP.

[25] Andrea Esuli, et al. SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining, 2006, LREC.

[26] Helmut Schmid. Probabilistic part-of-speech tagging using decision trees, 1994.

[27] R. Plutchik. Emotion, a psychoevolutionary synthesis, 1980.

[28] E. Doyle McCarthy, et al. The Social Construction of Emotions: New Directions from Culture Theory, 1994.

[29] Raffay Hamid, et al. What makes an image popular?, 2014, WWW.

[30] Luc Van Gool, et al. The Interestingness of Images, 2013, ICCV.

[31] Dong Liu, et al. Towards a comprehensive computational model for aesthetic assessment of videos, 2013, ACM Multimedia.

[32] H. Markus, et al. Culture and the self: Implications for cognition, emotion, and motivation, 1991.

[33] Thierry Pun, et al. DEAP: A Database for Emotion Analysis Using Physiological Signals, 2012, IEEE Transactions on Affective Computing.

[34] E. Vesterinen, et al. Affective Computing, 2009, Encyclopedia of Biometrics.

[35] Yi-Hsuan Yang, et al. Cross-cultural mood regression for music digital libraries, 2014, IEEE/ACM Joint Conference on Digital Libraries.

[36] L. R. Moscovice. Max Planck Institute for Evolutionary Anthropology, Department of Primatology, 2017.

[37] Allan Hanbury, et al. Affective image classification using features inspired by psychology and art theory, 2010, ACM Multimedia.

[38] Rada Mihalcea, et al. Learning Multilingual Subjective Language via Cross-Lingual Projections, 2007, ACL.

[39] Tao Chen, et al. Predicting Viewer Affective Comments Based on Image Content in Social Media, 2014, ICMR.

[40] J. Russell. Culture and the categorization of emotions, 1991, Psychological Bulletin.

[41] Rosalind W. Picard. Affective Computing, 1997.

[42] Jie Tang, et al. Can we understand van Gogh's mood?: learning to infer affects from images in social networks, 2012, ACM Multimedia.

[43] Mike Thelwall, et al. Sentiment strength detection in short informal text, 2010.

[44] Jiebo Luo, et al. Robust Image Sentiment Analysis Using Progressively Trained and Domain Transferred Deep Networks, 2015, AAAI.

[45] Jeffrey Dean, et al. Distributed Representations of Words and Phrases and their Compositionality, 2013, NIPS.

[46] Rongrong Ji, et al. Large-scale visual sentiment ontology and detectors using adjective noun pairs, 2013, ACM Multimedia.

[47] K. Scherer, et al. The Geneva affective picture database (GAPED): a new 730-picture database focusing on valence and normative significance, 2011, Behavior Research Methods.

[48] Alexandra Balahur, et al. Multilingual Sentiment Analysis using Machine Translation?, 2012, WASSA@ACL.

[49] P. Ekman. Facial expression and emotion, 1993, The American Psychologist.

[50] Yan Ke, et al. The Design of High-Level Features for Photo Quality Assessment, 2006, CVPR.

[51] Rongrong Ji, et al. SentiBank: large-scale ontology and classifiers for detecting sentiment and emotions in visual content, 2013, ACM Multimedia.