SemEval-2023 Task 10: Explainable Detection of Online Sexism

Online sexism is a widespread and harmful phenomenon. Automated tools can assist in detecting sexism at scale. Binary detection, however, disregards the diversity of sexist content and fails to explain why something is sexist. To address these limitations, we introduce SemEval-2023 Task 10 on the Explainable Detection of Online Sexism (EDOS). We make three main contributions: i) a novel hierarchical taxonomy of sexist content, which includes granular vectors of sexism to aid explainability; ii) a new dataset of 20,000 social media comments with fine-grained labels, along with larger unlabelled datasets for model adaptation; and iii) baseline models, as well as an analysis of the methods, results and errors of participant submissions to our task.
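To make the baseline contribution concrete, the sketch below fine-tunes an off-the-shelf transformer encoder for the binary (sexist / not sexist) level of the taxonomy using the HuggingFace Transformers library. It is a minimal illustration under stated assumptions, not the organisers' actual baseline: the file name edos_train.csv, the column names text and label_sexist, and the choice of roberta-base are all hypothetical placeholders for demonstration.

# Minimal binary sexism-detection baseline (illustrative sketch).
# Assumes a training CSV with `text` and `label_sexist` columns; these
# names are assumptions, not confirmed by the task description above.
import pandas as pd
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

MODEL_NAME = "roberta-base"  # any encoder could be substituted here

# Load the (hypothetical) training split and map labels to {0, 1}.
df = pd.read_csv("edos_train.csv")
df["label"] = (df["label_sexist"] == "sexist").astype(int)
ds = Dataset.from_pandas(df[["text", "label"]], preserve_index=False)

# Tokenize; padding is handled dynamically by the Trainer's default collator.
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
ds = ds.map(lambda batch: tokenizer(batch["text"], truncation=True), batched=True)

# Fine-tune a two-class sequence classifier.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
args = TrainingArguments(output_dir="edos-baseline",
                         num_train_epochs=3,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args, train_dataset=ds, tokenizer=tokenizer).train()

The same pattern extends to the more granular levels of the hierarchy by swapping in the corresponding fine-grained label column and adjusting num_labels.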
