SemEval 2021 Task 7: HaHackathon, Detecting and Rating Humor and Offense

SemEval 2021 Task 7, HaHackathon, was the first shared task to combine the previously separate domains of humor detection and offense detection. We collected 10,000 texts from Twitter and the Kaggle Short Jokes dataset, and had each text annotated for humor and offense by 20 annotators aged 18-70. Our subtasks were binary humor detection, prediction of humor and offense ratings, and a novel controversy task: predicting whether the variance in a text's humor ratings was higher than a specific threshold. Each subtask attracted between 36 and 58 submissions, and most participants chose to use pre-trained language models. Many of the highest-performing teams also implemented additional optimization techniques, including task-adaptive training and adversarial training. The results suggest that the participating systems are well suited to humor detection, but that humor controversy is a more challenging task. We discuss which models excel in this task and which auxiliary techniques boost their performance, and we analyze the errors that the best systems failed to capture.
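
The controversy label described above can be illustrated with a minimal sketch. The function name `controversy_label`, the example ratings, and the threshold value of 1.5 are all hypothetical and chosen only for illustration; the abstract does not state the actual threshold, the rating scale, or whether population or sample variance was used (population variance is assumed here).

```python
from statistics import pvariance


def controversy_label(ratings, threshold):
    """Return 1 if the variance of the humor ratings exceeds the threshold, else 0.

    Assumption: population variance is used; the task description only
    says "variance in the humor ratings".
    """
    return int(pvariance(ratings) > threshold)


# Hypothetical threshold and ratings, for illustration only.
THRESHOLD = 1.5

ratings_agree = [3, 3, 3, 4, 3]  # annotators mostly agree
ratings_split = [0, 5, 0, 5, 1]  # annotators strongly disagree

label_agree = controversy_label(ratings_agree, THRESHOLD)
label_split = controversy_label(ratings_split, THRESHOLD)

print(label_agree, label_split)
```

Under these assumed values, the text with near-unanimous ratings falls below the threshold and the text with split ratings falls above it, which is the distinction the controversy subtask asks systems to predict from the text alone.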
