SIDI: A Scalable in-Memory Density-based Index for Spatial Databases

With the widespread use of location-based services, spatial data is becoming increasingly common. Because this data is usually huge in volume and arrives continuously in real time, designing systems that store it efficiently is challenging. Two major issues complicate building such systems: the skewed distribution of the data and the need to scale storage across multiple machines. In this paper, we propose a novel scalable in-memory density-based index for spatial databases. The key principle underlying our design is to exploit the stable spatial distribution of the datasets in order to deploy a simple but efficient index structure. We use information extracted from past data to split the entire space into independent pieces of similar density, which ensures load balancing and scalability. Experimental results show that the proposed solution scales well in a distributed environment and outperforms common indexes in many cases.
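
To make the idea of splitting space into pieces of similar density concrete, the following is a minimal illustrative sketch and not the SIDI algorithm itself. It assumes a k-d-tree-style recursive median split over a sample of historical points; the names `split_by_density` and `locate`, the fixed `depth` parameter, and the rectangular cell shape are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def split_by_density(points: List[Point], box: Box, depth: int) -> List[Box]:
    """Hypothetical helper: recursively cut `box` at the median of the
    historical points along its longer side, so that each resulting cell
    holds a similar number of points (up to 2**depth cells)."""
    if depth == 0 or len(points) < 2:
        return [box]
    xmin, ymin, xmax, ymax = box
    # Cut across the longer side of the box (x on ties).
    axis = 0 if (xmax - xmin) >= (ymax - ymin) else 1
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    cut = ordered[mid][axis]
    # Assumption: coordinates are distinct along the cut axis; ties at the
    # cut value would need an explicit tie-breaking rule.
    lower, upper = ordered[:mid], ordered[mid:]
    if axis == 0:
        lower_box, upper_box = (xmin, ymin, cut, ymax), (cut, ymin, xmax, ymax)
    else:
        lower_box, upper_box = (xmin, ymin, xmax, cut), (xmin, cut, xmax, ymax)
    return (split_by_density(lower, lower_box, depth - 1)
            + split_by_density(upper, upper_box, depth - 1))


def locate(boxes: List[Box], p: Point) -> int:
    """Return the index of the half-open cell containing `p`, or -1."""
    for i, (xmin, ymin, xmax, ymax) in enumerate(boxes):
        if xmin <= p[0] < xmax and ymin <= p[1] < ymax:
            return i
    return -1


# Skewed sample: four points clustered near the origin, four spread out.
history = [(1, 1), (2, 5), (3, 2), (4, 7),
           (50, 60), (60, 20), (70, 90), (90, 40)]
cells = split_by_density(history, (0, 0, 100, 100), depth=2)
load = [0] * len(cells)
for p in history:
    load[locate(cells, p)] += 1
```

In this sketch the dense cluster ends up covered by small cells and the sparse region by large ones, so each cell carries the same share of the sample. Under the stable-distribution assumption stated above, new data would be expected to spread across the cells in similar proportions, and each cell could then be assigned to a separate machine.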