ColBERT: Using BERT Sentence Embedding for Humor Detection

Automatic humor detection has useful applications in modern technologies such as chatbots and personal assistants. In this paper, we describe a novel approach for detecting humor in short texts using BERT sentence embeddings. Our proposed model uses BERT to tokenize each text and generate a sentence embedding for it. The embedding is then fed to a two-layer neural network that predicts the target value. For evaluation, we created a new dataset for humor detection consisting of 200k formal short texts (100k positive, 100k negative). Experimental results show an accuracy of 98.1 percent for the proposed method, a 2.1 percent improvement over the best CNN and RNN models and 1.1 percent better than a fine-tuned BERT model. In addition, a combined RNN-CNN model did not outperform the CNN model on this task.
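
The classification head described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state layer sizes, activations, or training details, so the embedding size (768, as in BERT-base), the hidden width, the ReLU and sigmoid activations, and all function names are assumptions. A random vector stands in for the BERT sentence embedding so that the example runs without a pretrained model, and the weights are untrained.

```python
import math
import random

EMBED_DIM = 768   # assumed: BERT-base hidden size; not stated in the abstract
HIDDEN_DIM = 32   # hypothetical hidden-layer width


def init_layer(n_in, n_out, rng):
    """Create a dense layer as (weights, biases) with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Apply an affine map: one output per weight row."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_humor(embedding, hidden_layer, output_layer):
    """Two-layer network: embedding -> hidden (ReLU) -> probability (sigmoid)."""
    hidden = relu(dense(embedding, hidden_layer))
    logit = dense(hidden, output_layer)[0]
    return sigmoid(logit)


rng = random.Random(0)
hidden_layer = init_layer(EMBED_DIM, HIDDEN_DIM, rng)
output_layer = init_layer(HIDDEN_DIM, 1, rng)

# Stand-in for a BERT sentence embedding (assumption: a real pipeline would
# obtain this vector from a pretrained BERT encoder).
fake_embedding = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]

prob = predict_humor(fake_embedding, hidden_layer, output_layer)
print(f"P(humorous) = {prob:.3f}")
```

In a real pipeline, the placeholder vector would be replaced by the encoder's output for each text, and the two layers would be trained on the labeled dataset; a probability above 0.5 would then be read as the positive (humorous) class.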
