A review on application of data mining techniques to combat natural disasters

Abstract Natural disasters (e.g., earthquakes, floods, tsunamis, hurricanes and other storms, landslides, cloudbursts, heat waves, forest fires) claim thousands of human lives around the globe every year, in addition to causing significant damage to property, animal life, etc. In this paper, we review the data mining and analytical techniques designed so far for (i) prediction, (ii) detection, and (iii) development of appropriate disaster management strategies based on data collected from disasters. We describe in detail the data available from geological observatories (seismological, hydrological), satellites, remote sensing, and newer sources such as social networking sites like Twitter. We present an extensive, in-depth literature study of current techniques for disaster prediction, detection, and management, and summarize the results by type of disaster. Finally, we propose a framework for building, in a phased manner, a disaster management database for India hosted on an open source Big Data platform such as Hadoop. The study has a special focus on India, which ranks among the top five countries in terms of the absolute number of human lives lost.
