A Twitter Data Credibility Framework - Hurricane Harvey as a Use Case

Social media data have been used over the past decade to improve geographic situation awareness. Although such data are free and openly available, only a small proportion is relevant to situation awareness, and their reliability or trustworthiness remains a challenge. A credibility framework is proposed for Twitter data in the context of disaster situation awareness. The framework is derived from the crowdsourcing principle that errors propagated in volunteered information decrease as the number of contributors increases. In the proposed framework, credibility is assessed hierarchically on two tweet levels. The framework was tested using Hurricane Harvey Twitter data, from which situation-awareness-related tweets were extracted using a set of predefined keywords including power, shelter, damage, casualty, and flood. For each tweet, the text message and associated URLs were integrated to make the information more complete. Events were identified by aggregating tweets based on their topics and spatiotemporal characteristics. Credibility for events was calculated and analyzed against the spatial, temporal, and social impacting scales. The framework has the potential to calculate evolving credibility in real time, giving users insight into the most important and trustworthy events.
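
The keyword extraction and event aggregation steps described in the abstract can be illustrated with a minimal Python sketch. Only the five keywords come from the abstract. Everything else is an assumption made for illustration: the tweet record fields (`user`, `text`, `lat`, `lon`, `date`), the crude prefix matching, the 0.1-degree grid cell, the daily time bin, and the use of a distinct-contributor count as a stand-in for credibility. It is not the scoring method used in the paper.

```python
import math
import re
from collections import defaultdict

# Keywords listed in the abstract; the full predefined set may be larger.
KEYWORDS = ("power", "shelter", "damage", "casualty", "flood")


def matched_topics(text, keywords=KEYWORDS):
    """Return the keywords that some word in the text starts with.

    Prefix matching is a crude assumption: it catches "flooding" for
    "flood" but also false positives such as "powerful" for "power".
    """
    words = re.findall(r"[a-z]+", text.lower())
    return sorted(k for k in keywords if any(w.startswith(k) for w in words))


def group_events(tweets, cell_deg=0.1):
    """Group tweets into hypothetical events keyed by (topic, grid cell, date).

    The value for each event is the number of distinct users who posted
    about it, a naive proxy for the idea that more independent
    contributors imply fewer propagated errors.
    """
    users = defaultdict(set)
    for t in tweets:
        cell = (math.floor(t["lat"] / cell_deg), math.floor(t["lon"] / cell_deg))
        for topic in matched_topics(t["text"]):
            users[(topic, cell, t["date"])].add(t["user"])
    return {key: len(names) for key, names in users.items()}
```

Tweets that match none of the keywords are dropped, and a tweet that matches several keywords contributes to one event per topic. A real system would need proper geocoding, finer temporal windows, and the URL content integration mentioned above.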
