In Data We Trust: A Critical Analysis of Hate Speech Detection Datasets

Recently, a few studies have discussed, from different viewpoints, the limitations of datasets collected for hate speech detection. We intend to contribute to the conversation by providing a consolidated overview of the data-related issues that debilitate research in this area. Specifically, we discuss how differing pre-processing steps and differing formats for publicly releasing data result in highly varying datasets, which makes an objective comparison between studies difficult and unfair. To the best of our knowledge, there is currently no study focused on comparing the attributes of existing hate speech detection datasets, outlining their limitations, and recommending approaches for future research. This work intends to fill that gap and become the one-stop shop for information regarding hate speech datasets.
