A BERT-based Approach for Automatic Humor Detection and Scoring

In this paper we report our participation in the 2019 HAHA task, which provides a corpus of crowd-annotated tweets and requires systems to decide whether a tweet is a joke (Task 1) and to predict a funniness score for it (Task 2). Our approach uses BERT, a multi-layer bidirectional transformer encoder that learns deep bidirectional representations; the pretrained model is fine-tuned on the HAHA training data. The representation of a tweet is fed into an output layer for classification. To predict the funniness score, we apply another output layer that generates scores from float labels and train it with the mean squared error between the predicted scores and the labels. Our best F-score on the test set for Task 1 is 0.784, and our RMSE for Task 2 is 0.910. We find that our approach is competitive and applicable to multilingual text classification tasks.
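The two output layers and the two evaluation metrics described above can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the 3-dimensional vectors stand in for BERT tweet representations, the weights are hand-picked rather than learned, and the sigmoid threshold for the classification head is an assumption, as are all function and variable names.

```python
import math

def linear(weights, bias, features):
    """Affine output layer: maps a feature vector to a single value."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean_squared_error(predictions, labels):
    """Training loss assumed for the scoring head."""
    return sum((p - y) ** 2 for p, y in zip(predictions, labels)) / len(labels)

def rmse(predictions, labels):
    """Root mean squared error, the Task 2 metric."""
    return math.sqrt(mean_squared_error(predictions, labels))

def f1_score(predicted, gold):
    """F-score of the positive (joke) class, the Task 1 metric."""
    tp = sum(1 for p, g in zip(predicted, gold) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(predicted, gold) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(predicted, gold) if p == 0 and g == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical stand-ins for encoder outputs (a real BERT vector is much larger).
tweets = [[0.2, -0.1, 0.5], [-0.4, 0.3, 0.1], [0.6, 0.2, -0.3]]
is_joke = [1, 0, 1]            # toy binary labels for Task 1
funniness = [3.0, 0.0, 2.0]    # toy float labels for Task 2

# Hand-picked parameters; in practice both heads are learned during fine-tuning.
cls_w, cls_b = [1.0, -1.0, 0.5], 0.0
reg_w, reg_b = [2.0, 0.0, 1.0], 1.5

# Classification head: threshold the sigmoid of the linear output (assumption).
joke_pred = [1 if sigmoid(linear(cls_w, cls_b, t)) >= 0.5 else 0 for t in tweets]
# Scoring head: the linear output is used directly as the predicted score.
score_pred = [linear(reg_w, reg_b, t) for t in tweets]

task1_f1 = f1_score(joke_pred, is_joke)
task2_mse = mean_squared_error(score_pred, funniness)
task2_rmse = rmse(score_pred, funniness)
print(task1_f1, task2_mse, task2_rmse)
```

In a real fine-tuning setup, the gradient of each loss would flow back through the output layer into the encoder; the sketch only shows how the predictions, the loss, and the reported metrics relate.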
