AVAYA: Sentiment Analysis on Twitter with Self-Training and Polarity Lexicon Expansion

This paper describes the systems submitted by Avaya Labs (AVAYA) to SemEval-2013 Task 2: Sentiment Analysis in Twitter. For the constrained conditions of both the message polarity classification and contextual polarity disambiguation subtasks, our approach centers on training high-dimensional linear classifiers with a combination of lexical and syntactic features. The constrained message polarity model is then used to tag nearly half a million unlabeled tweets. These automatically labeled data serve two purposes: (1) to discover prior polarities of words and (2) to provide additional training examples for self-training. Our systems performed competitively, placing in the top five for all subtasks and data conditions. More importantly, these results show that expanding the polarity lexicon and augmenting the training data with unlabeled tweets can improve precision and recall when classifying the polarity of non-neutral messages and contexts.
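The pipeline described above (train a linear model, tag unlabeled tweets, then use the tagged tweets both to induce word polarities and to retrain) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: a small margin perceptron over bag-of-words counts stands in for the high-dimensional linear classifiers, the task is reduced to two classes (the shared task also includes neutral), and the names `train`, `auto_label`, `induce_lexicon`, the `threshold` cut-off, and the smoothed log-ratio polarity score are all hypothetical.

```python
import math
from collections import Counter, defaultdict

def featurize(text):
    """Bag-of-words counts (stand-in for lexical and syntactic features)."""
    return Counter(text.lower().split())

def score(weights, text):
    """Linear score; positive means positive polarity."""
    return sum(weights.get(tok, 0.0) * n for tok, n in featurize(text).items())

def train(examples, epochs=10, margin=1.0):
    """Margin perceptron: update whenever label * score <= margin."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for text, label in examples:          # label is +1 or -1
            if label * score(weights, text) <= margin:
                for tok, n in featurize(text).items():
                    weights[tok] += label * n
    return dict(weights)

def auto_label(weights, unlabeled, threshold=2.0):
    """Keep only tweets the model scores confidently (assumed cut-off)."""
    tagged = []
    for text in unlabeled:
        s = score(weights, text)
        if abs(s) >= threshold:
            tagged.append((text, 1 if s > 0 else -1))
    return tagged

def induce_lexicon(tagged):
    """Prior polarity per word: smoothed log-ratio of tweet counts
    in automatically tagged positive vs. negative tweets."""
    pos, neg = Counter(), Counter()
    for text, label in tagged:
        (pos if label > 0 else neg).update(set(featurize(text)))
    return {w: math.log((pos[w] + 1) / (neg[w] + 1))
            for w in set(pos) | set(neg)}

# Toy data, invented for this sketch.
labeled = [
    ("i love this phone", 1),
    ("great day so happy", 1),
    ("i hate this phone", -1),
    ("awful day so sad", -1),
]
unlabeled = [
    "love this awesome song",
    "hate this terrible song",
    "this phone",                # low-confidence, left untagged
]

base = train(labeled)                         # constrained model
tagged = auto_label(base, unlabeled)          # automatically labeled tweets
lexicon = induce_lexicon(tagged)              # purpose 1: word polarities
self_trained = train(labeled + tagged)        # purpose 2: self-training
```

In this toy run the word "awesome" never appears in the hand-labeled data, so the base model gives it no weight; it acquires a positive lexicon score and a positive weight only through the automatically tagged tweet. Errors in the automatic labels would propagate the same way, which is why the sketch keeps only confidently scored tweets.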
