Latent Hatred: A Benchmark for Understanding Implicit Hate Speech

Hate speech has grown significantly on social media, causing serious consequences for victims of all demographics. Although much attention has been paid to characterizing and detecting discriminatory speech, most work has focused on explicit or overt hate speech, failing to address a more pervasive form based on coded or indirect language. To fill this gap, this work introduces a theoretically justified taxonomy of implicit hate speech and a benchmark corpus with fine-grained labels for each message and its implication. We present systematic analyses of our dataset using contemporary baselines to detect and explain implicit hate speech, and we discuss key features that challenge existing models. This dataset will continue to serve as a useful benchmark for understanding this multifaceted issue. To download the data, see https://github.com/GT-SALT/implicit-hate.
