Scaling the Colombian Data Cube Using a Distributed Architecture

The main goal of the Colombian Data Cube (CDCol) initiative is to enable users from Colombian institutions to develop algorithms, run analyses, and create products from large datasets of remote sensing images. The first version of the CDCol platform uses a single-server architecture, which limits the number of analyses it can produce and is not horizontally scalable. This paper presents a distributed architecture for CDCol, defined and tested on Amazon Web Services (AWS), and evaluates how it can increase the throughput of the CDCol platform by using several datacube servers. Performance tests executed on AWS show that the architecture can increase the number of tasks processed concurrently and reduce the execution time of large-area analyses.