SUTNLP at SemEval-2023 Task 10: RLAT-Transformer for explainable online sexism detection

There is no simple definition of sexism, but it can be described as prejudice, stereotyping, or discrimination, especially against women, based on their gender. Sexism is common in online interactions. One out of ten American adults says that they have been harassed because of their gender and have been the target of sexism, so sexism is a growing issue. The Explainable Detection of Online Sexism shared task in SemEval-2023 aims at building sexism detection systems for the English language. To address the problem, we use large language models such as RoBERTa and DeBERTa. In addition, we present Random Layer Adversarial Training (RLAT) for transformers and show its significant impact on solving all subtasks. Moreover, we use virtual adversarial training and contrastive learning to improve performance on subtask A. On the test sets of subtasks A, B, and C, we obtained macro-F1 scores of 84.45, 67.78, and 52.52, respectively, outperforming the proposed baselines on all subtasks. Our code is publicly available on GitHub.
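
For readers unfamiliar with the reported metric, the following is a minimal sketch of how macro-F1 is computed: the unweighted mean of the per-class F1 scores. The function name `macro_f1` and the label strings are illustrative assumptions, not taken from the authors' code, and the function returns a fraction in [0, 1], whereas the scores above are given as percentages.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0

    # Count true positives, false positives and false negatives per class.
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical binary example in the style of subtask A.
gold = ["sexist", "not sexist", "not sexist", "sexist"]
pred = ["sexist", "not sexist", "sexist", "sexist"]
score = macro_f1(gold, pred)
print(f"macro-F1: {100 * score:.2f}")
```

Because every class contributes equally to the average, macro-F1 penalizes poor performance on rare classes, which matters for the fine-grained subtasks B and C where some categories have few examples.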
