GeoMesa: a distributed architecture for spatio-temporal fusion

Recent advances in distributed databases and computing have transformed the landscape of spatio-temporal machine learning. This paper presents GeoMesa, a distributed spatio-temporal database built on top of Hadoop and column-family databases such as Accumulo and HBase, which includes a suite of tools for indexing, managing, and analyzing both vector and raster data. The indexing techniques use space-filling curves to map multi-dimensional data onto the single lexicographically ordered list of keys managed by the underlying distributed database. In contrast to traditional non-distributed RDBMSs, GeoMesa scales horizontally: resources can be added at runtime, and the index rebalances across them. In the raster domain, GeoMesa leverages Accumulo's server-side iterators and aggregators to perform raster interpolation and associative map algebra operations in parallel at query time. The paper concludes with two geo-time data fusion examples: aggregating Twitter data by keyword with GeoMesa, and using georegistration to drape full-motion video (FMV) over terrain.
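
To make the space-filling-curve idea concrete, the following is a minimal sketch of a generic Z-order (Morton) encoding for longitude/latitude pairs. It is an illustration only and is not taken from GeoMesa's code: the function names (`interleave_bits`, `normalize`, `z_key`), the 16-bit resolution per dimension, and the fixed-width hexadecimal key format are all assumptions chosen for the example, and the curve GeoMesa actually uses may differ.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order value.

    Bits of x land on even positions, bits of y on odd positions.
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def normalize(value: float, lo: float, hi: float, bits: int = 16) -> int:
    """Map a value in [lo, hi] to an integer cell index in [0, 2**bits - 1]."""
    max_cell = (1 << bits) - 1
    clamped = min(max(value, lo), hi)
    return int((clamped - lo) / (hi - lo) * max_cell)


def z_key(lon: float, lat: float, bits: int = 16) -> str:
    """Build a fixed-width hex key (hypothetical format) for a lon/lat point.

    Zero-padding to a fixed width makes lexicographic string order agree
    with numeric order of the Z-order value, which is what a sorted
    key-value store needs.
    """
    x = normalize(lon, -180.0, 180.0, bits)
    y = normalize(lat, -90.0, 90.0, bits)
    z = interleave_bits(x, y, bits)
    width = (2 * bits + 3) // 4  # hex digits needed for 2 * bits bits
    return format(z, f"0{width}x")
```

Sorting a collection of points by `z_key` yields a one-dimensional ordering in which points that are close in space tend to receive nearby keys, so a bounding-box query can be served by scanning a small number of contiguous key ranges. A third dimension such as time could be interleaved in the same way, again as an assumption about the general technique rather than a description of GeoMesa's index.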