Classification of Twitter Disaster Data Using a Hybrid Feature-Instance Adaptation Approach

Social media users generate huge amounts of data during emergency situations, and these data are regarded as troves of critical information. However, the use of supervised machine learning techniques in the early stages of a disaster is hindered by the lack of labeled data for that particular disaster. Furthermore, supervised models trained on labeled data from a prior disaster may not produce accurate results, given the inherent differences between the current and prior disasters. To address the lack of labeled data for a target disaster, we propose a hybrid feature-instance adaptation approach in which feature adaptation is based on matrix factorization and instance adaptation is based on the k-nearest neighbors algorithm. The hybrid approach selects a subset of the source disaster data that is representative of the target disaster. The selected subset is subsequently used to learn accurate Naïve Bayes classifiers for the target disaster.
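The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the factorization variant, the distance measure, the value of k, or the features used by the classifier. The sketch assumes non-negative matrix factorization with multiplicative updates, cosine similarity in the latent space, k = 2, and a multinomial Naïve Bayes classifier trained on the original term counts. The toy term-count matrices and all function names are hypothetical.

```python
import numpy as np

def nmf(X, rank, iters=200, seed=0):
    """Approximate nonnegative X (docs x terms) as W @ H (assumed NMF variant)."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank)) + 1e-3
    H = rng.random((rank, X.shape[1])) + 1e-3
    eps = 1e-9  # avoids division by zero
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def select_source_instances(W_src, W_tgt, k):
    """Indices of source rows that are among the k nearest (cosine) to any target row."""
    def unit(M):
        return M / (np.linalg.norm(M, axis=1, keepdims=True) + 1e-9)
    sim = unit(W_tgt) @ unit(W_src).T          # shape: n_target x n_source
    nearest = np.argsort(-sim, axis=1)[:, :k]  # top-k source rows per target row
    return np.unique(nearest)

def train_nb(X, y, alpha=1.0):
    """Multinomial Naive Bayes with Laplace smoothing."""
    classes = np.unique(y)
    log_prior = np.log(np.array([(y == c).mean() for c in classes]))
    counts = np.array([X[y == c].sum(axis=0) for c in classes]) + alpha
    log_lik = np.log(counts / counts.sum(axis=1, keepdims=True))
    return classes, log_prior, log_lik

def predict_nb(model, X):
    classes, log_prior, log_lik = model
    return classes[np.argmax(X @ log_lik.T + log_prior, axis=1)]

# Hypothetical toy data; columns: flood, rescue, water, help, lol, game.
X_src = np.array([[2, 1, 1, 0, 0, 0],
                  [1, 0, 2, 1, 0, 0],
                  [0, 0, 0, 0, 2, 1],
                  [0, 0, 0, 1, 1, 2],
                  [0, 1, 0, 2, 0, 0],
                  [0, 0, 0, 0, 1, 1]])
y_src = np.array([1, 1, 0, 0, 1, 0])   # 1 = disaster-related, 0 = not
X_tgt = np.array([[1, 1, 0, 1, 0, 0],  # unlabeled target-disaster tweets
                  [0, 0, 1, 2, 0, 0],
                  [0, 0, 0, 0, 1, 2]])

# Feature adaptation: factorize source and target jointly into a shared latent space.
W, H = nmf(np.vstack([X_src, X_tgt]), rank=2)
W_src, W_tgt = W[:len(X_src)], W[len(X_src):]

# Instance adaptation: keep only source tweets close to the target tweets.
selected = select_source_instances(W_src, W_tgt, k=2)

# Train Naive Bayes on the selected source subset, then label the target tweets.
model = train_nb(X_src[selected], y_src[selected])
preds = predict_nb(model, X_tgt)
print("selected source rows:", selected, "target predictions:", preds)
```

Factorizing the stacked source and target matrices together is one way to obtain latent features shared by both disasters; the neighbor search then runs in that shared space rather than on the raw vocabulary.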
