Hate versus politics: detection of hate against policy makers in Italian tweets

Accurate detection of hate speech against politicians, policy making and political ideas is crucial for maintaining democracy and free speech. Unfortunately, the labelled data needed to train hate speech detection models is limited and domain-dependent. In this paper, we address the classification of hate speech against policy makers in Italian tweets, producing the first resource of this type for this language. We collected and annotated 1264 tweets, examined the cases of disagreement between annotators, and performed in-domain and cross-domain hate speech classification with different features and algorithms. We achieved a ROC AUC of 0.83 and analyzed the most predictive attributes, finding that language features differ between the anti-policymaker and anti-immigration domains. Finally, we visualized networks of hashtags to capture the topics used in hateful and normal tweets.
