Regressing Controversy of Music Artists from Microblogs

Social media is a valuable data source for researchers analyzing how people feel about a variety of topics, from politics to products to entertainment. This paper addresses the detection of controversies involving music artists, based on microblogs. In particular, we develop a new controversy detection dataset consisting of 53,441 tweets related to 95 music artists, and we devise and evaluate a comprehensive set of user- and content-based feature candidates to regress controversy. The evaluation results show strong performance of the presented approach on the controversy detection task: an F1 score of 0.811 in a classification task and an RMSE of 0.688 in a regression task, using controversy scores in the range [1, 4]. In addition, the results obtained by applying the presented approach to a dataset from a different domain (CNN news controversy) demonstrate the transferability of the developed feature set, with a significant improvement over prior approaches. Combining the adopted Gradient Boosting-based classifier with the developed feature set yields an F1 score of 0.775, an improvement of 9.8% over the best prior result on this dataset.
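To make the two reported metrics concrete, the following minimal sketch computes RMSE on controversy scores in the range [1, 4] and F1 on a binarized version of the same scores. It is an illustration only, not the paper's evaluation code: the toy scores, the helper names, and the 2.5 threshold used to turn scores into controversial/non-controversial labels are all assumptions made for this example.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length score lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def f1_score(y_true, y_pred):
    """F1 of the positive class for two equal-length lists of booleans."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def binarize(scores, threshold=2.5):
    """Map scores in [1, 4] to labels; the threshold is a hypothetical choice."""
    return [s >= threshold for s in scores]


# Hypothetical gold and predicted controversy scores (not data from the paper).
gold = [1.0, 2.0, 3.0, 4.0, 3.5, 1.5]
pred = [1.5, 2.0, 2.0, 3.5, 3.0, 3.0]

regression_error = rmse(gold, pred)
classification_f1 = f1_score(binarize(gold), binarize(pred))
print(f"RMSE: {regression_error:.3f}  F1: {classification_f1:.3f}")
```

Lower RMSE is better and is expressed in the same units as the controversy scores, while F1 lies in [0, 1] with higher being better, so the two numbers are not directly comparable.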
