Large-scale visual sentiment ontology and detectors using adjective noun pairs

We address the challenge of sentiment analysis from visual content. In contrast to existing methods, which infer sentiment or emotion directly from low-level visual features, we propose a novel approach based on understanding the visual concepts that are strongly related to sentiments. Our contribution is two-fold. First, we present a method built upon psychological theories and web mining to automatically construct a large-scale Visual Sentiment Ontology (VSO) consisting of more than 3,000 Adjective Noun Pairs (ANPs). Second, we propose SentiBank, a novel visual concept detector library that can be used to detect the presence of 1,200 ANPs in an image. The VSO and SentiBank are distinct from existing work and open the door to various applications enabled by automatic sentiment analysis. Experiments on detecting the sentiment of image tweets demonstrate a significant improvement in detection accuracy when the proposed SentiBank-based predictors are compared with text-based approaches. The effort also yields a large publicly available resource consisting of a visual sentiment ontology, a large detector library, and a training/testing benchmark for visual sentiment analysis.
