Deep CNN-LSTM with Word Embeddings for News Headline Sarcasm Detection

Detecting sarcasm has been a difficult problem in Artificial Intelligence (AI) because it depends heavily on context and on a human agent’s knowledge of the world. For conventional AI, this means that a large number of rules must be hard-coded. Furthermore, naive machine learning methods such as logistic regression simply generate lists of words that are frequently associated or dissociated with sarcasm. Logistic regression is therefore unable to take groupings of words into account in sentiment analysis problems. In this paper, we design a deep neural network that leverages the advantages of convolutional neural networks (CNN) and Long Short-Term Memory (LSTM) layers. This CNN-LSTM neural network with word embeddings is trained on 21,709 word vector encodings of news headlines to determine whether a headline is sarcastic or genuine. We then apply the network to a corpus of 5000 test examples of real and sarcastic news headlines. The Deep CNN-LSTM architecture classifies whether a news headline is real or satirical with 86.16% accuracy.
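To make the data flow of such a model concrete, the following is a minimal pure-Python sketch of a forward pass through an embedding lookup, a 1D convolution, a single LSTM layer, and a sigmoid output. It is not the network from the paper: the abstract does not specify layer sizes, ordering, or hyperparameters, so the vocabulary, the dimensions, the function names (`encode`, `conv1d_relu`, `lstm_last_hidden`, `predict_sarcasm`), and the random, untrained weights are all assumptions chosen for illustration. Bias terms are omitted for brevity.

```python
import math
import random

random.seed(0)

# Hypothetical vocabulary and sizes, chosen only for illustration
# (none of these values come from the paper).
VOCAB = {"<unk>": 0, "area": 1, "man": 2, "wins": 3, "award": 4, "nothing": 5}
EMBED_DIM, N_FILTERS, KERNEL, HIDDEN = 4, 3, 2, 3


def rand_matrix(rows, cols):
    """Small random weights standing in for trained parameters."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


embedding = rand_matrix(len(VOCAB), EMBED_DIM)
# One weight vector per filter, spanning KERNEL consecutive word vectors.
conv_w = rand_matrix(N_FILTERS, KERNEL * EMBED_DIM)
# LSTM gates (input, forget, output, candidate) act on [features; previous hidden].
lstm_w = {g: rand_matrix(HIDDEN, N_FILTERS + HIDDEN) for g in "ifoc"}
out_w = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN)]


def encode(headline):
    """Map each lowercased token to its embedding vector (unknown words -> <unk>)."""
    return [embedding[VOCAB.get(tok, 0)] for tok in headline.lower().split()]


def conv1d_relu(vectors):
    """Slide each filter over windows of KERNEL adjacent words, then apply ReLU."""
    feats = []
    for i in range(len(vectors) - KERNEL + 1):
        window = [x for vec in vectors[i:i + KERNEL] for x in vec]
        feats.append([max(0.0, dot(w, window)) for w in conv_w])
    return feats


def lstm_last_hidden(sequence):
    """Run a single LSTM layer over the sequence and return the final hidden state."""
    h, c = [0.0] * HIDDEN, [0.0] * HIDDEN
    for x in sequence:
        z = x + h  # concatenate current features with previous hidden state
        i = [sigmoid(dot(w, z)) for w in lstm_w["i"]]
        f = [sigmoid(dot(w, z)) for w in lstm_w["f"]]
        o = [sigmoid(dot(w, z)) for w in lstm_w["o"]]
        g = [math.tanh(dot(w, z)) for w in lstm_w["c"]]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h


def predict_sarcasm(headline):
    """Return a score in (0, 1); with untrained weights it carries no meaning."""
    feats = conv1d_relu(encode(headline))
    return sigmoid(dot(out_w, lstm_last_hidden(feats)))


score = predict_sarcasm("area man wins nothing")
print(f"untrained score: {score:.3f}")
```

The convolution looks at windows of adjacent words, and the LSTM then reads those window features in order, so the score depends on word groupings and their sequence rather than on isolated word counts. In practice the weights would be learned from labeled headlines with a deep learning framework rather than set at random.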
