Tweester at SemEval-2016 Task 4: Sentiment Analysis in Twitter Using Semantic-Affective Model Adaptation

We describe our submission to SemEval2016 Task 4: Sentiment Analysis in Twitter. The proposed system ranked first for the subtask B. Our system comprises of multiple independent models such as neural networks, semantic-affective models and topic modeling that are combined in a probabilistic way. The novelty of the system is the employment of a topic modeling approach in order to adapt the semantic-affective space for each tweet. In addition, significant enhancements were made in the main system dealing with the data preprocessing and feature extraction including the employment of word embeddings. Each model is used to predict a tweet’s sentiment (positive, negative or neutral) and a late fusion scheme is adopted for the final decision.

[1]  Pedro Antonio Gutiérrez,et al.  Ordinal Regression Methods: Survey and Experimental Study , 2016, IEEE Transactions on Knowledge and Data Engineering.

[2]  M. Bradley,et al.  Affective Norms for English Words (ANEW): Instruction Manual and Affective Ratings , 1999 .

[3]  Phil Blunsom,et al.  A Convolutional Neural Network for Modelling Sentences , 2014, ACL.

[4]  Amac Herdagdelen,et al.  Twitter n-gram corpus with demographic metadata , 2013, Language Resources and Evaluation.

[5]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[6]  Matthias Hagen,et al.  Webis: An Ensemble for Twitter Sentiment Detection , 2015, *SEMEVAL.

[7]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[8]  Alexandros Potamianos,et al.  Cognitively Motivated Distributional Representations of Meaning , 2016, LREC.

[9]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[10]  Claire Cardie,et al.  Multi-aspect Sentiment Analysis with Topic Models , 2011, 2011 IEEE 11th International Conference on Data Mining Workshops.

[11]  Tobias Günther,et al.  GU-MLT-LT: Sentiment Analysis of Short Messages using Linguistic Features and Stochastic Gradient Descent , 2013, *SEMEVAL.

[12]  Preslav Nakov,et al.  SemEval-2015 Task 10: Sentiment Analysis in Twitter , 2015, *SEMEVAL.

[13]  Leysia Palen,et al.  Natural Language Processing to the Rescue? Extracting "Situational Awareness" Tweets During Mass Emergency , 2011, ICWSM.

[14]  Thomas Hofmann,et al.  Learning the Similarity of Documents: An Information-Geometric Approach to Document Retrieval and Categorization , 1999, NIPS.

[15]  G. Eysenbach,et al.  Pandemics in the Age of Twitter: Content Analysis of Tweets during the 2009 H1N1 Outbreak , 2010, PloS one.

[16]  SangKeun Lee,et al.  Joint multi-grain topic sentiment: modeling semantic aspects for online reviews , 2016, Inf. Sci..

[17]  Brendan T. O'Connor,et al.  Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters , 2013, NAACL.

[18]  Jr. G. Forney,et al.  Viterbi Algorithm , 1973, Encyclopedia of Machine Learning.

[19]  Bernard J. Jansen,et al.  Twitter power: Tweets as electronic word of mouth , 2009, J. Assoc. Inf. Sci. Technol..

[20]  Petr Sojka,et al.  Software Framework for Topic Modelling with Large Corpora , 2010 .

[21]  Jason Weston,et al.  Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..

[22]  Shrikanth S. Narayanan,et al.  SAIL: Sentiment Analysis using Semantic Similarity and Contrast Features , 2014, *SEMEVAL.

[23]  Janyce Wiebe,et al.  Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis , 2005, HLT.

[24]  Susan T. Dumais,et al.  Improving information retrieval using latent semantic indexing , 1988 .

[25]  Ramón Fernández Astudillo,et al.  INESC-ID: Sentiment Analysis without Hand-Coded Features or Linguistic Resources using Embedding Subspaces , 2015, *SEMEVAL.

[26]  Jiri Matas,et al.  On Combining Classifiers , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[27]  Navneet Kaur,et al.  Opinion mining and sentiment analysis , 2016, 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom).

[28]  Finn Årup Nielsen,et al.  A New ANEW: Evaluation of a Word List for Sentiment Analysis in Microblogs , 2011, #MSM.

[29]  Xu Ling,et al.  Topic sentiment mixture: modeling facets and opinions in weblogs , 2007, WWW '07.

[30]  Shrikanth S. Narayanan,et al.  Distributional Semantic Models for Affective Text Analysis , 2013, IEEE Transactions on Audio, Speech, and Language Processing.

[31]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[32]  Venu Govindaraju,et al.  Review of Classifier Combination Methods , 2008, Machine Learning in Document Analysis and Recognition.

[33]  Stefan Evert,et al.  KLUE: Simple and robust methods for polarity classification , 2013, *SEMEVAL.

[34]  Tomoko Ohkuma,et al.  TeamX: A Sentiment Analyzer with Enhanced Lexicon Mapping and Weighting Scheme for Unbalanced Data , 2014, *SEMEVAL.

[35]  Preslav Nakov,et al.  SemEval-2014 Task 9: Sentiment Analysis in Twitter , 2014, *SEMEVAL.

[36]  Alessandro Moschitti,et al.  UNITN: Training Deep Convolutional Neural Network for Twitter Sentiment Classification , 2015, *SEMEVAL.

[37]  Davide Buscaldi,et al.  From humor recognition to irony detection: The figurative language of social media , 2012, Data Knowl. Eng..

[38]  Ari Rappoport,et al.  Enhanced Sentiment Learning Using Twitter Hashtags and Smileys , 2010, COLING.

[39]  Yanghui Rao,et al.  Contextual Sentiment Topic Model for Adaptive Social Emotion Classification , 2016, IEEE Intelligent Systems.

[40]  Saif Mohammad,et al.  NRC-Canada: Building the State-of-the-Art in Sentiment Analysis of Tweets , 2013, *SEMEVAL.

[41]  Preslav Nakov,et al.  SemEval-2013 Task 2: Sentiment Analysis in Twitter , 2013, *SEMEVAL.

[42]  Yulan He,et al.  Joint sentiment/topic model for sentiment analysis , 2009, CIKM.

[43]  Alexandros Potamianos,et al.  Valence, arousal and dominance estimation for English, German, Greek, Portuguese and Spanish lexica using semantic models , 2015, INTERSPEECH.