An Ensemble Model for Cross-Domain Polarity Classification on Twitter

Polarity analysis of Social Media content is of significant importance for various applications. Most current approaches treat this task as a classification problem, demanding a labeled corpus for training purposes. However, if the learned model is applied on a different domain, the performance drops significantly and, given that it is impractical to have labeled corpora for every domain, this becomes a challenging task. In the current work, we address this problem, by proposing an ensemble classifier that is trained on a general domain and and adapts, without the need for additional ground truth, on the desired (test) domain before classifying a document. Our experiments are performed on three different datasets and the obtained results are compared with various baselines and state-of-the-art methods; we demonstrate that our model is outperforming all out-of-domain trained baseline algorithms, and that it is even comparable with different in-domain classifiers.

[1]  Junlan Feng,et al.  Robust Sentiment Detection on Twitter from Biased and Noisy Data , 2010, COLING.

[2]  Jason Baldridge,et al.  Twitter Polarity Classification with Label Propagation over Lexical Links and the Follower Graph , 2011, ULNLP@EMNLP.

[3]  Yiannis Kompatsiaris,et al.  EventSense: capturing the pulse of large-scale events by mining social media streams , 2013, PCI '13.

[4]  Alan F. Smeaton,et al.  Classifying sentiment in microblogs: is brevity an advantage? , 2010, CIKM.

[5]  Johan Bollen,et al.  Modeling Public Mood and Emotion: Twitter Sentiment and Socio-Economic Phenomena , 2009, ICWSM.

[6]  Nicholas Diakopoulos,et al.  Cooooooooooooooollllllllllllll!!!!!!!!!!!!!! Using Word Lengthening to Detect Sentiment in Microblogs , 2011, EMNLP.

[7]  Bing Liu,et al.  Mining and summarizing customer reviews , 2004, KDD.

[8]  Tiejun Zhao,et al.  Target-dependent Twitter Sentiment Classification , 2011, ACL.

[9]  Isabell M. Welpe,et al.  Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment , 2010, ICWSM.

[10]  Andrea Esuli,et al.  SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining , 2006, LREC.

[11]  Albert Bifet,et al.  Sentiment Knowledge Discovery in Twitter Streaming Data , 2010, Discovery Science.

[12]  Fabrício Benevenuto,et al.  Comparing and combining sentiment analysis methods , 2013, COSN '13.

[13]  Jonathon Read,et al.  Using Emoticons to Reduce Dependency in Machine Learning Techniques for Sentiment Classification , 2005, ACL.

[14]  Jeff Heflin,et al.  The Semantic Web – ISWC 2012 , 2012, Lecture Notes in Computer Science.

[15]  Harith Alani,et al.  Semantic Sentiment Analysis of Twitter , 2012, SEMWEB.

[16]  Sabine Bergler,et al.  When Specialists and Generalists Work Together: Overcoming Domain Dependence in Sentiment Tagging , 2008, ACL.

[17]  Dan Klein,et al.  Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network , 2003, NAACL.

[18]  Lei Zhang,et al.  Combining lexicon-based and learning-based methods for twitter sentiment analysis , 2011 .

[19]  Janyce Wiebe,et al.  Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis , 2005, HLT.