Spatiotemporal data mining in the era of big spatial data: algorithms and applications

Spatial data mining is the process of discovering interesting, previously unknown, and potentially useful patterns from spatial and spatiotemporal data. However, the explosive growth of spatial and spatiotemporal data, together with the emergence of social media and location-sensing technologies, underscores the need for new, computationally efficient methods tailored to analyzing big data. In this paper, we review major spatial data mining algorithms, looking closely at their computational and I/O requirements, and touch on a few applications that deal with big spatial data.
