Levels of Hate in Online Environments

Hate speech in online environments is a severe problem for several reasons: the space for reasoning and argumentation shrinks, individuals refrain from expressing their opinions, and polarization of views increases. Hate speech also contributes to a climate in which threats and even violence are increasingly regarded as acceptable. The amount and intensity of hate expressions vary greatly between digital environments. To analyze the level of hate in a given online environment, to study its development over time, and to compare the level of hate across online environments, we have developed the notion of a hate level. The hate level summarizes the level of hate in a given digital environment. We present methods for determining the hate level automatically, using transfer learning on pre-trained language models with annotated data to create automated hate detectors. We evaluate our approaches on a set of websites and discussion forums.
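
As a rough illustration of how a hate level might be computed once a detector exists, the sketch below treats the hate level as the share of posts in an environment that the detector flags. This definition is an assumption made for illustration only; the text above does not give a formula. The names `hate_level` and `toy_detector`, the `[HATE]` marker, and the sample posts are all hypothetical, and `toy_detector` merely stands in for a detector obtained by fine-tuning a pre-trained language model.

```python
from typing import Callable, Iterable


def hate_level(posts: Iterable[str], is_hateful: Callable[[str], bool]) -> float:
    """Share of posts flagged as hateful.

    Assumed definition for illustration; not taken from the source text.
    """
    posts = list(posts)
    if not posts:
        # An empty environment has no measurable hate; return 0.0 by convention.
        return 0.0
    flagged = sum(1 for post in posts if is_hateful(post))
    return flagged / len(posts)


def toy_detector(text: str) -> bool:
    """Hypothetical stand-in for a fine-tuned language-model detector.

    Flags any text carrying the placeholder marker "[HATE]".
    """
    return "[HATE]" in text


# Placeholder posts for two hypothetical environments.
forum_a = ["[HATE] post 1", "neutral post", "[HATE] post 2", "neutral post"]
forum_b = ["neutral post", "neutral post", "neutral post", "[HATE] post"]

level_a = hate_level(forum_a, toy_detector)
level_b = hate_level(forum_b, toy_detector)
print(f"forum_a: {level_a:.2f}, forum_b: {level_b:.2f}")
```

Because the detector is passed in as a function, the same computation could be applied to snapshots of one environment at different times, or to several environments, which is the kind of comparison the hate level is meant to support.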
