Detection and Prediction of Natural Hazards Using Large-Scale Environmental Data

Recent developments in remote sensing have made it possible to instrument and sense the physical world with high resolution and fidelity. Consequently, very large spatio-temporal environmental data sets have become available to the research community. Such data consist of time series, starting as early as 1973, that monitor up to thousands of environmental parameters for each spatial region, at resolutions as fine as \(0.5'\times 0.5'\). To make this flood of data actionable, we employ a data-driven approach to detect and predict natural hazards. Our supervised learning approach learns from labeled historic events. We describe each event by a three-mode tensor covering space, time, and environmental parameters. Because the number of environmental parameters is very large, and latent features may be hidden within them, we employ a tensor factorization approach to learn latent factors. As the corresponding tensors can grow very large, we propose an outlier score for sparsification, so that only interesting (location, time, parameter) triples are modeled explicitly. In our experimental evaluation, we apply this approach to the use case of predicting the rapid intensification of tropical storms. Learning from past tropical storms, we show that our approach predicts the future rapid intensification of tropical storms with high accuracy, matching the accuracy of domain-specific solutions without using any domain knowledge.
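
The pipeline described above (event tensor, outlier-based sparsification, latent factors) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not the method of this work: a per-parameter z-score stands in for the unspecified outlier score, a plain CP decomposition fitted by alternating least squares stands in for the unspecified tensor factorization, and the function names (`sparsify_by_zscore`, `cp_als`), the threshold, and the synthetic data are hypothetical.

```python
import numpy as np

def sparsify_by_zscore(tensor, threshold=2.0):
    """Keep only entries whose per-parameter |z-score| reaches the threshold.

    Assumption: a z-score is used here as a simple stand-in outlier score.
    The tensor has modes (location, time, parameter).
    """
    mean = tensor.mean(axis=(0, 1), keepdims=True)
    std = tensor.std(axis=(0, 1), keepdims=True)
    std = np.where(std == 0, 1.0, std)  # avoid division by zero
    z = (tensor - mean) / std
    return np.where(np.abs(z) >= threshold, z, 0.0)

def khatri_rao(a, b):
    """Column-wise Kronecker product of a (I x R) and b (J x R)."""
    rank = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, rank)

def unfold(tensor, mode):
    """Matricize the tensor along the given mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def cp_als(tensor, rank, n_iter=50, seed=0):
    """Rank-R CP decomposition of a three-mode tensor via alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in tensor.shape]
    for _ in range(n_iter):
        for mode in range(3):
            first, second = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(first, second)
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfold(tensor, mode) @ kr @ np.linalg.pinv(gram)
    return factors

def reconstruct(factors):
    """Rebuild the tensor from its three factor matrices."""
    a, b, c = factors
    return np.einsum("ir,jr,kr->ijk", a, b, c)

# Synthetic event tensor: 6 locations x 8 time steps x 4 parameters (made-up data).
rng = np.random.default_rng(42)
event = rng.standard_normal((6, 8, 4))
sparse_event = sparsify_by_zscore(event, threshold=1.5)
location_f, time_f, parameter_f = cp_als(sparse_event, rank=2)
print("kept entries:", np.count_nonzero(sparse_event), "of", event.size)
print("factor shapes:", location_f.shape, time_f.shape, parameter_f.shape)
```

In such a sketch, the factor matrices would serve as the low-dimensional features passed to a supervised classifier trained on the labeled historic events; how the events are combined and which classifier is used are not stated in the text above.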
