Clickbait Detection in Tweets Using Self-attentive Network

Clickbait detection in tweets remains a challenging problem. In this paper, we describe our solution, the Zingel Clickbait Detector, submitted to the Clickbait Challenge 2017; it estimates the clickbait strength of each tweet. Based on the annotation scheme, we first recast the regression problem as a multi-class classification problem. To perform the classification, we apply a token-level self-attentive mechanism to the hidden states of bi-directional Gated Recurrent Units (biGRU), which enables the model to generate task-specific vector representations of tweets by attending to important tokens. The self-attentive neural network can be trained end-to-end, without any manual feature engineering. Our detector ranked first in the final evaluation of the Clickbait Challenge 2017.
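The two ideas in the abstract, attention pooling over per-token hidden states and treating the score as a multi-class problem, can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring vector `w`, the simplified dot-product scoring, the function names, and the four evenly spaced annotation levels are all assumptions made for illustration, and the hidden states are hard-coded instead of coming from a trained biGRU.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, w):
    """Collapse per-token hidden states into a single tweet vector.

    hidden_states: list of T vectors, standing in for biGRU outputs.
    w: hypothetical learned scoring vector (assumption, same dimension).
    """
    # One scalar relevance score per token (simplified dot-product scoring).
    scores = [sum(h_d * w_d for h_d, w_d in zip(h, w)) for h in hidden_states]
    # Normalize the scores into attention weights that sum to 1.
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Attention-weighted sum of the hidden states, one entry per dimension.
    pooled = [
        sum(a * h[d] for a, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return pooled, weights


def expected_score(class_probs, levels=(0.0, 1 / 3, 2 / 3, 1.0)):
    """Map class probabilities back to a scalar clickbait score.

    levels: assumed annotation levels, one per class (an assumption here).
    """
    return sum(p * v for p, v in zip(class_probs, levels))


# Toy usage: three tokens with 2-dimensional hidden states.
hidden = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
w = [2.0, 0.0]
pooled, weights = attention_pool(hidden, w)
score = expected_score([0.25, 0.25, 0.25, 0.25])
```

In this toy input the first token receives the largest attention weight because it aligns with `w`, so it dominates the pooled vector. In a trained model, a classifier over the pooled vector would supply the class probabilities passed to `expected_score`.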
