A cloud-enabled automatic disaster analysis system of multi-sourced data streams: An example synthesizing social media, remote sensing and Wikipedia data

Abstract: Social media streams and remote sensing data have emerged as new sources for tracking disaster events and assessing their damage. Previous studies have followed a case-by-case approach, in which a specific event is first chosen and filtering criteria (e.g., keywords, spatiotemporal information) are then manually designed and used to retrieve relevant data for disaster analysis. This paper presents a framework that synthesizes multi-sourced data (e.g., social media, remote sensing, Wikipedia, and the Web) with spatial data mining and text mining technologies to build an architecturally resilient and elastic solution that supports disaster analysis of historical and future events. Within the proposed framework, Wikipedia is used as a primary source of historical disaster events, which are extracted to build an event database. This database characterizes the salient spatiotemporal patterns and characteristics of each type of disaster. It also provides basic semantics, such as event name (e.g., Hurricane Sandy), type (e.g., flooding), and spatiotemporal scope, which the proposed procedures then tune to extract additional information (e.g., hashtags for searching tweets) and to query and retrieve relevant social media and remote sensing data for a specific disaster. Besides historical event analysis and pattern mining, the cloud-based framework can also support real-time event tracking and monitoring by providing on-demand, elastic computing power and storage. A prototype is implemented and tested with data related to the 2012 Hurricane Sandy and the 2013 Colorado flooding.
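The retrieval step described in the abstract, in which an event record supplies a name, type, and spatiotemporal scope that are then used to select relevant social media posts, can be illustrated with a minimal sketch. This is not the authors' implementation: the `DisasterEvent` and `Tweet` classes, the `matches` and `filter_tweets` functions, and the bounding box, dates, and keywords below are all hypothetical values chosen for illustration, and a rectangular bounding box is assumed as the simplest possible spatial scope.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Tuple


@dataclass
class DisasterEvent:
    """Hypothetical event record of the kind an event database might hold."""
    name: str
    event_type: str
    start: datetime
    end: datetime
    # Assumed spatial scope: (min_lon, min_lat, max_lon, max_lat)
    bbox: Tuple[float, float, float, float]
    keywords: List[str] = field(default_factory=list)


@dataclass
class Tweet:
    """Hypothetical, simplified geotagged post."""
    text: str
    time: datetime
    lon: float
    lat: float


def matches(event: DisasterEvent, tweet: Tweet) -> bool:
    """Return True if the tweet falls inside the event's time window and
    bounding box and mentions at least one event keyword."""
    in_time = event.start <= tweet.time <= event.end
    min_lon, min_lat, max_lon, max_lat = event.bbox
    in_space = (min_lon <= tweet.lon <= max_lon
                and min_lat <= tweet.lat <= max_lat)
    text = tweet.text.lower()
    has_keyword = any(k.lower() in text for k in event.keywords)
    return in_time and in_space and has_keyword


def filter_tweets(event: DisasterEvent, tweets: List[Tweet]) -> List[Tweet]:
    """Keep only the tweets that match the event's scope and keywords."""
    return [t for t in tweets if matches(event, t)]


# Illustrative values only; none of these come from the paper's data.
event = DisasterEvent(
    name="Hurricane Sandy",
    event_type="flooding",
    start=datetime(2012, 10, 22),
    end=datetime(2012, 11, 2),
    bbox=(-80.0, 35.0, -70.0, 45.0),
    keywords=["#sandy", "hurricane sandy"],
)

tweets = [
    Tweet("Power is out across the block #Sandy", datetime(2012, 10, 30), -74.0, 40.7),
    Tweet("Nice weather today", datetime(2012, 10, 30), -74.0, 40.7),
    Tweet("#Sandy throwback", datetime(2013, 6, 1), -74.0, 40.7),
    Tweet("#Sandy coverage on TV", datetime(2012, 10, 30), -120.0, 37.0),
]

relevant = filter_tweets(event, tweets)
```

Only the first post satisfies all three conditions; the others fail on keyword, time window, and location respectively. A real system would likely replace the bounding box with a polygon and the substring match with a tuned keyword or hashtag set, as the abstract suggests.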
