Automatic Classification of Abusive Language and Personal Attacks in Various Forms of Online Communication

The sheer ease with which abusive and hateful utterances can be made online using today's digital communication technologies (especially social media), typically from the comfort of one's own home and without any immediate negative repercussions, is responsible for their significant increase and global ubiquity. Natural Language Processing technologies can help address the negative effects of this development. In this contribution we evaluate a set of classification algorithms on two types of user-generated online content (tweets and Wikipedia Talk comments) in two languages (English and German). The data sets we work with were annotated for aspects such as racism, sexism, hate speech, aggression and personal attacks. While acknowledging issues with inter-annotator agreement for classification tasks using these labels, the focus of this paper is on classifying the data according to the annotated characteristics using several text classification algorithms. For some classification tasks we reach F-scores of up to 81.58.
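
To make the classification setup concrete, the following is a minimal sketch of a supervised text-classification baseline for one such binary label (e.g. personal attack vs. none). It assumes a scikit-learn-style TF-IDF plus linear SVM pipeline; the toy data, label names and model choice are illustrative assumptions, not the paper's actual feature set or algorithms.

```python
# Minimal sketch of an abusive-language classification baseline.
# NOTE: illustrative only; data, labels and model are assumptions,
# not the pipeline evaluated in the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

# Hypothetical data: short comments/tweets with binary labels (1 = abusive).
texts = [
    "you are an idiot",
    "have a nice day",
    "go back to your country",
    "great article, thanks",
]
labels = [1, 0, 1, 0]

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.5, stratify=labels, random_state=0
)

# Character n-grams are comparatively robust to the noisy spelling
# found in user-generated content such as tweets.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5)),
    LinearSVC(),
)
clf.fit(X_train, y_train)
print("F1 on held-out data:", f1_score(y_test, clf.predict(X_test)))
```

In practice such a baseline would be trained and evaluated separately for each annotated characteristic (racism, sexism, hate speech, aggression, personal attacks) and for each language.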
