Towards a Comprehensive Taxonomy and Large-Scale Annotated Corpus for Online Slur Usage

Abusive language classifiers have been shown to exhibit bias against women and racial minorities. Because these models are trained on data collected using keywords, they tend to be highly sensitive to pejoratives. As a result, comments written by victims of abuse are frequently labelled as hateful, even when they only discuss or reclaim slurs. Addressing bias in keyword-based corpora requires a better understanding of pejorative language, as well as an equitable representation of targeted users in data collection. We make two main contributions to this end. First, we provide an annotation guide that outlines 4 main categories of online slur usage, which we further divide into a total of 12 sub-categories. Second, we present a publicly available corpus based on our taxonomy, with 39.8k human-annotated comments extracted from Reddit. This corpus was annotated by a diverse cohort of coders, with Shannon equitability indices of 0.90, 0.92, and 0.87 across sexuality, ethnicity, and gender, respectively. Taken together, our taxonomy and corpus allow researchers to evaluate classifiers on a wider range of speech containing slurs.
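For readers unfamiliar with the diversity measure mentioned above, the Shannon equitability index is commonly defined as the Shannon entropy of the group proportions divided by its maximum, E = H / ln(S), where H = -Σ p_i ln(p_i) and S is the number of groups. The following is a minimal sketch under that standard definition; the function name and the example group labels are hypothetical and are not taken from the corpus or its annotation pipeline.

```python
import math
from collections import Counter


def shannon_equitability(labels):
    """Return Shannon equitability E = H / ln(S) for a list of group labels.

    H is the Shannon entropy of the observed group proportions and S is the
    number of distinct groups observed. E is 1.0 when all groups are equally
    represented and approaches 0 as one group dominates.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    num_groups = len(counts)
    if num_groups < 2:
        # With fewer than two groups, ln(S) is 0; treat evenness as 0 by convention.
        return 0.0
    entropy = -sum(
        (c / total) * math.log(c / total) for c in counts.values()
    )
    return entropy / math.log(num_groups)


# Hypothetical annotator self-identifications, for illustration only.
balanced = ["group_a", "group_b", "group_c", "group_d"]
skewed = ["group_a"] * 9 + ["group_b"]

print(shannon_equitability(balanced))
print(shannon_equitability(skewed))
```

A perfectly balanced cohort yields a value of 1, so the reported indices of 0.87 to 0.92 indicate that annotators were spread fairly evenly across the groups within each demographic dimension.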
