BigDataVoyant: Automated Profiling of Large Geospatial Data

We envisage an open, extensible, and scalable data profiling framework over various types of geospatial data, including vector, raster, and multidimensional assets. In this paper, we outline our work in progress on the design and implementation of BigDataVoyant, a software platform for profiling big geospatial data. The software can ingest data in various spatial formats and reference systems. Its main goal is to extract and visualize a wide variety of metadata and descriptions of data quality and characteristics, both interactively and in a fully automated manner. We suggest a processing flow for such profiling and discuss a preliminary, yet comprehensive, list of metadata items already supported by the open-source software prototype we are implementing. Finally, we outline open issues and extensions of the proposed framework that would broaden its usefulness and strengthen its appeal to the geospatial data community.