Geospatial query processing in surface and hybrid spaces

The growing popularity of online Earth visualization tools and geo-realistic games, together with the availability of high-resolution terrain data, has motivated a new class of queries: spatial queries over the land surface, which extend traditional spatial queries to a constrained third dimension. The fundamental technical challenge preventing the realization of these applications lies in data management. In particular, large real-world geospatial datasets residing on disk need to be queried and accessed as if they were synthetically rendered data in memory. Unfortunately, the majority of current disk-based data structures are designed to expedite the rendering of this geo-realistic data (e.g., Google Earth™) rather than its querying and access. In this thesis, I discuss a series of efficient index structures over a subset of this data. These data structures are critical for expediting several important classes of spatial queries in geospatial databases: snapshot and continuous k-nearest-neighbor (kNN) queries on the terrain surface, and scalable browsing of shortest surface paths (i.e., the path query). In addition, for the first time, spatial queries are extended to the more realistic hybrid environment, which overlays real-world road networks on top of the terrain models. Because finding the shortest hybrid path is a new and more challenging problem, disk-efficient data structures and algorithms are proposed that minimize I/O cost and integrate seamlessly with the surface index framework. All of the methods discussed in this thesis have been experimentally verified on large-scale real-world and synthetic datasets.