A NEW INITIATIVE FOR TILING, STITCHING AND PROCESSING GEOSPATIAL BIG DATA IN DISTRIBUTED COMPUTING ENVIRONMENTS

Abstract. In recent years, several new approaches and solutions for Big Data processing have been developed. The geospatial domain, however, still lacks well-established distributed processing solutions tailored to the volume and heterogeneity of geodata, especially when fast data processing is required. The goal of such systems is to reduce processing time by distributing data transparently across processing (and/or storage) nodes. These methodologies are based on the divide-and-conquer concept. Nevertheless, in the context of geospatial processing, most distributed computing frameworks have important limitations regarding both data distribution and data partitioning methods. Moreover, flexibility and extensibility for handling various data types (often in binary formats) are also strongly required. This paper presents a concept for tiling, stitching and processing of big geospatial data. The system is based on the IQLib concept (https://github.com/posseidon/IQLib/) developed within the IQmulus EU FP7 research and development project (http://www.iqmulus.eu). The data distribution framework places no restrictions on the programming language environment and can execute scripts (and workflows) written in different development frameworks (e.g. Python, R or C#). It is capable of processing raster, vector and point cloud data. A prototype of the system is presented through a case study dealing with country-wide processing of raster imagery. Further investigation of algorithmic and implementation details is planned for the near future.
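The divide-and-conquer idea behind tiling, processing and stitching can be illustrated with a minimal Python sketch. It is not the IQLib API: the function names `tile`, `process`, `stitch` and `run` are hypothetical, a nested list stands in for a raster, a local thread pool stands in for distributed processing nodes, and the per-pixel threshold is an arbitrary placeholder operation chosen because it needs no overlap between neighbouring tiles.

```python
from concurrent.futures import ThreadPoolExecutor


def tile(raster, tile_h, tile_w):
    """Split a 2D list into (row_offset, col_offset, block) tiles."""
    tiles = []
    for r in range(0, len(raster), tile_h):
        for c in range(0, len(raster[0]), tile_w):
            block = [row[c:c + tile_w] for row in raster[r:r + tile_h]]
            tiles.append((r, c, block))
    return tiles


def process(block):
    """Placeholder per-pixel operation (assumption): threshold at 128."""
    return [[1 if v >= 128 else 0 for v in row] for row in block]


def stitch(tiles, height, width):
    """Reassemble processed tiles into one raster using their offsets."""
    out = [[None] * width for _ in range(height)]
    for r, c, block in tiles:
        for i, row in enumerate(block):
            out[r + i][c:c + len(row)] = row
    return out


def run(raster, tile_h, tile_w, workers=4):
    """Tile the raster, process tiles concurrently, stitch the results."""
    height, width = len(raster), len(raster[0])
    parts = tile(raster, tile_h, tile_w)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each tile is independent, so the work could equally be sent
        # to separate nodes; threads are used here only for illustration.
        results = list(pool.map(lambda t: (t[0], t[1], process(t[2])), parts))
    return stitch(results, height, width)
```

Because the placeholder operation treats each pixel independently, `run(raster, 2, 3)` gives the same result as calling `process(raster)` on the whole raster. Operations that depend on neighbouring pixels would additionally need overlapping tile borders, which this sketch does not cover.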