Detecting the Reputation Polarity of Microblog Posts

We address the task of detecting the reputation polarity of social media updates, that is, deciding whether the content of an update has positive or negative implications for the reputation of a given entity. Typical approaches to this task rely on sentiment lexicons and linguistic features. However, they fall short in the social media domain because of its unedited and noisy nature and, more importantly, because reputation polarity is encoded not only in sentiment-bearing words but also in other word usage. Automatic methods for extracting discriminative features can therefore play a role in reputation polarity detection. We propose a data-driven, supervised approach for extracting textual features, which we use to train a reputation polarity classifier. Experiments on the RepLab 2013 collection show that our model outperforms the state-of-the-art method based on sentiment analysis by 20% in accuracy.
