Do Women Perceive Hate Differently: Examining the Relationship Between Hate Speech, Gender, and Agreement Judgments

Understanding hate speech remains a significant challenge both for creating reliable datasets and for automated hate speech detection. We hypothesize that being part of the targeted group or personally agreeing with an assertion substantially affects hate speech perception. To test these hypotheses, we create FEMHATE, a dataset containing 400 assertions that target women. These assertions are judged by female and male subjects for (i) how hateful they are and (ii) whether the subjects agree with them. We find that women and men judge extreme cases of hate speech consistently. We also find a strong relationship between hate speech and agreement judgments, showing that a low agreement score is a prerequisite for hate speech. We show how this relationship can be used for automatic hate speech detection. Our best system based on agreement judgments outperforms a baseline SVM classifier (equipped with n-grams) by a wide margin.
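To make the claimed relationship concrete, the following is a minimal sketch of how one might check whether low agreement behaves as a prerequisite for hate speech. It is not the system described above. The score scale (both scores in [-1, 1]), the thresholds, the function names, and the toy values are all assumptions made for illustration and are not taken from FEMHATE.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def prerequisite_violations(agreement, hate, agree_threshold=0.0, hate_threshold=0.0):
    """Indices of assertions that are both agreed with and judged hateful.

    If low agreement is a prerequisite for hate speech, this list should be
    empty or close to empty. Both thresholds are hypothetical.
    """
    return [
        i
        for i, (a, h) in enumerate(zip(agreement, hate))
        if a > agree_threshold and h > hate_threshold
    ]


# Toy, made-up scores on an assumed [-1, 1] scale (not real annotation data).
toy_agreement = [0.8, 0.5, 0.1, -0.4, -0.9]
toy_hate = [-0.9, -0.7, -0.6, 0.2, 0.9]

if __name__ == "__main__":
    print("correlation:", pearson(toy_agreement, toy_hate))
    print("violations:", prerequisite_violations(toy_agreement, toy_hate))
```

A strongly negative correlation together with an empty violation list would be consistent with the prerequisite reading. A negative correlation alone would not establish it, since the prerequisite claim is one-directional: low agreement does not by itself imply hatefulness.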
