A Sentiment-Aware Contextual Model for Real-Time Disaster Prediction Using Twitter Data

The massive amount of data generated by social media presents a unique opportunity for disaster analysis. As a leading social platform, Twitter generates over 500 million Tweets each day. Because Tweets arrive in real time, a growing number of agencies use Twitter to track disaster events and plan rescues quickly. However, it is challenging to build an accurate predictive model that identifies disaster Tweets, because a Tweet may lack sufficient context owing to the length limit. In addition, disaster Tweets and regular ones can be hard to distinguish because of word ambiguity. In this paper, we propose a sentiment-aware contextual model named SentiBERT-BiLSTM-CNN for disaster detection using Tweets. The proposed learning pipeline consists of SentiBERT, which generates sentiment-aware contextual embeddings from a Tweet; a bidirectional long short-term memory (BiLSTM) layer with attention; and a 1D convolutional layer for local feature extraction. We conduct extensive experiments to validate the model's design choices and to compare it with its peers. Results show that the proposed SentiBERT-BiLSTM-CNN achieves a higher F1 score than the compared models, making it a competitive model for Tweet-based disaster prediction.
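Since the comparison above is reported in terms of the F1 score, a minimal sketch of that metric may help. It uses the standard definition (the harmonic mean of precision and recall on the positive class). The function name, the label encoding (1 for a disaster Tweet, 0 for a regular one), and the sample labels are assumptions made for illustration only; they are not taken from the paper's experiments.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined),
        # so F1 is reported as 0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels (assumption): 1 = disaster Tweet, 0 = regular Tweet.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

With these made-up labels there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75. F1 is a common choice over plain accuracy when the positive class is comparatively rare, because it penalizes both missed disaster Tweets and false alarms.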
