Rapid Spatial Aggregation

Data visualization is an important component of spatial data analysis. We demonstrate the visualization of spatial and spatio-temporal data on map tiles as implemented in the R package RgoogleMaps. We argue that extremely large spatial or location data sets can lead to clutter and information overload, necessitating aggregation to higher-level geographical units. Such aggregation requires associating each coordinate point in the data set with a particular spatial polygon in the search space. Examples of such polygon-based spatial partitions are zip codes, census blocks, and school districts. Unless efficient data structures are used, this can be a computationally expensive task involving an exhaustive search across all prospective polygons. In this paper, we propose a methodology that exploits kd-trees as an efficient data structure for nearest-neighbour search to significantly reduce the effective number of polygons being searched and to expedite the lookup process. The kd-tree is built from the polygon centroids, from other carefully chosen points within the polygons, or from both. We further demonstrate a successful hybrid strategy that combines a range search with the tree-based ranking. Our code has been made publicly available as the R package RapidPolygonLookup.
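The core idea of the lookup can be illustrated with a minimal sketch in Python. It is not the RapidPolygonLookup API, and all function names, the toy polygons, and the choice of k are assumptions made for illustration. A kd-tree is built over polygon centroids (approximated here by the mean of the vertices), the k nearest centroids to a query point are retrieved, and the point-in-polygon test is run only on those candidates, nearest first, instead of on every polygon.

```python
import heapq

def build_kdtree(points, depth=0):
    """Build a 2-d kd-tree from (x, y, label) tuples; returns nested tuples or None."""
    if not points:
        return None
    axis = depth % 2
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def k_nearest(node, query, k, depth=0, heap=None):
    """Collect the k nearest tree points as a max-heap of (-squared_distance, label)."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    point, left, right = node
    axis = depth % 2
    d2 = (point[0] - query[0]) ** 2 + (point[1] - query[1]) ** 2
    if len(heap) < k:
        heapq.heappush(heap, (-d2, point[2]))
    elif d2 < -heap[0][0]:
        heapq.heapreplace(heap, (-d2, point[2]))
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    k_nearest(near, query, k, depth + 1, heap)
    # Visit the far side only if the splitting line is closer than the current k-th best.
    if len(heap) < k or diff * diff < -heap[0][0]:
        k_nearest(far, query, k, depth + 1, heap)
    return heap

def point_in_polygon(pt, poly):
    """Ray-casting test; poly is a list of (x, y) vertices."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def vertex_mean(poly):
    """Simplified centroid: mean of the vertices (an assumption of this sketch)."""
    return (sum(p[0] for p in poly) / len(poly),
            sum(p[1] for p in poly) / len(poly))

def lookup(pt, tree, polygons, k=3):
    """Test only the polygons whose centroids are among the k nearest to pt."""
    candidates = [label for _, label in sorted(k_nearest(tree, pt, k), reverse=True)]
    for label in candidates:  # nearest centroid first
        if point_in_polygon(pt, polygons[label]):
            return label
    return None  # no candidate contains the point

# Toy partition (hypothetical): a 2 x 2 grid of unit squares.
polygons = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
    "C": [(0, 1), (1, 1), (1, 2), (0, 2)],
    "D": [(1, 1), (2, 1), (2, 2), (1, 2)],
}
tree = build_kdtree([vertex_mean(p) + (name,) for name, p in polygons.items()])
result = lookup((1.2, 0.3), tree, polygons)
```

With irregular polygons the nearest centroid need not belong to the containing polygon, which is why the sketch checks several candidates rather than one. A point may also be missed if its polygon is not among the k candidates, so k trades speed against the risk of a failed lookup.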
