An integrated explicit and implicit offensive language taxonomy

Abstract. The current study presents an integrated model of explicit and implicit offensive language taxonomy. First, it focuses on a definitional revision and enrichment of the explicit offensive language taxonomy by reviewing the collection of available corpora and comparing the tagging schemas applied there. The study relies mainly on the offensive language categorization schemata originally proposed by Zampieri et al. (2019). After an explanation of the semantic differences between particular concepts used in the tagging systems and an analysis of theoretical frameworks, a finite set of classes is presented, which covers aspects of offensive language representation along with linguistically sound explanations (Lewandowska-Tomaszczyk et al. 2021). In the analytic procedure, offensive discourse is first distinguished from non-offensive discourse, followed by the question of the offence Target and the subsequent categorization levels and sublevels. Based on the relevant data generated from Sketch Engine (https://www.sketchengine.eu/ententen-english-corpus/), we propose the concept of offensive language as a superordinate category in our system, with 17 hierarchically arranged subcategories. The categories are taxonomically structured into 4 levels and verified with the use of neural-based (lexical) embeddings. Together with a taxonomy of implicit offensive language and its subcategorization levels, which has received little scholarly attention until now, the categorization is exemplified in samples of offensive discourse from selected English social media materials, i.e., 25 publicly available web-based hate speech datasets (consult Appendix 1 for a complete list). The offensive category levels (types of offence, targets, etc.) and aspects (offensive language property clusters), as well as the categories of explicitness and implicitness, are discussed in the study, and a computationally verified integrated explicit and implicit offensive language taxonomy is proposed.