Paper information - Hate Speech Dataset from a White Supremacy Forum

Hate speech is commonly defined as any communication that disparages a target group of people based on some characteristic such as race, colour, ethnicity, gender, sexual orientation, nationality, religion, or other characteristic. Due to the massive rise of user-generated web content on social media, the amount of hate speech is also steadily increasing. Over the past years, interest in online hate speech detection and, particularly, the automation of this task has continuously grown, along with the societal impact of the phenomenon. This paper describes a hate speech dataset composed of thousands of sentences manually labelled as containing hate speech or not. The sentences have been extracted from Stormfront, a white supremacist forum. A custom annotation tool has been developed to carry out the manual labelling task which, among other things, allows the annotators to choose whether to read the context of a sentence before labelling it. The paper also provides a thoughtful qualitative and quantitative study of the resulting dataset and several baseline experiments with different classification models. The dataset is publicly available.
