Integrating Visual and Textual Affective Descriptors for Sentiment Analysis of Social Media Posts

Social media posts often contain a mixture of images and text. This paper proposes an affective visual descriptor and an integrated visual-textual classification method for sentiment analysis of social media. First, a set of affective visual features is explored, inspired by theories from psychology and art. Second, a structured forest is proposed to generate a bag of affective words (BoAW) from the joint distribution of adjective-noun pairs (ANPs); the generated BoAW provides basic "visual cues" for sentiment analysis. A set of sentiment part (SSP) features is then introduced to integrate the visual and textual descriptors on multiple statistical manifolds, and multi-scale sentiment classification is finally applied through metric learning on the manifold kernels. In the proposed method, a class-activation map (CAM) network trained on ILSVRC 2014 is re-trained on an ANP-labelled affective visual data set. The global average pooling (GAP) layer of the CAM network is used for discriminative localization, and its fully-connected layer generates objective visual descriptors. For evaluation, 300 tweets containing both images and text are manually labelled, and the proposed structured forest is evaluated on the ANP-labelled image data set. Promising experimental results demonstrate the effectiveness of the proposed method for sentiment analysis of social media posts.
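
The discriminative-localization step builds on class activation mapping, in which the class weights of the fully-connected layer following GAP are re-applied spatially to the last convolutional feature maps. The sketch below is an illustration only, not the authors' implementation; the array shapes, function name, and normalization are assumptions for the example.

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """Minimal class activation mapping sketch (after Zhou et al., CVPR 2016).

    feature_maps : (K, H, W) activations of the last convolutional layer
    fc_weights   : (C, K) weights of the fully-connected layer that follows
                   global average pooling (GAP)
    class_idx    : index of the class (e.g., an ANP concept) to localize
    """
    # GAP collapses each of the K feature maps to a single value;
    # the fully-connected layer then scores each class from these K values.
    gap = feature_maps.mean(axis=(1, 2))                   # (K,)
    class_score = fc_weights[class_idx] @ gap              # scalar logit

    # Re-applying the same class weights at every spatial location
    # highlights the regions most responsible for the class score.
    cam = np.tensordot(fc_weights[class_idx], feature_maps, axes=1)  # (H, W)
    cam = np.maximum(cam, 0)
    cam = cam / (cam.max() + 1e-8)                         # normalize to [0, 1]
    return cam, class_score
```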
