An efficient query system for high-dimensional spatio-temporal data

This thesis develops an efficient query system for high-dimensional spatio-temporal data sets, in which each data point comprises a 3D spatial location, a time instant, and other attributes. Our approach consists of three phases. In the first phase, we preprocess the data set and partition it into clusters; in particular, we develop a fast, I/O-efficient clustering method for high-dimensional data that does not discard any dimension. This provides the foundation for the next two phases. In the second phase, we apply computational geometry algorithms to compute the 3D spatial boundary of each cluster and triangulate the space enclosed by each boundary into a set of tetrahedra, where each tetrahedron represents all the data points contained in it. In the third phase, we establish a 3D spatial data model for manipulating the data produced by the second phase, and we study 3D spatial operations and relationships on that model. We then extend SQL to support efficient 3D spatial queries. Finally, we present an implementation of the query system.
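
As an illustration of the second phase, where each tetrahedron stands for the data points it contains, the following is a minimal sketch of a point-in-tetrahedron test based on signed volumes. It is not the thesis's implementation: the names `signed_volume`, `contains`, and `assign_points` are hypothetical, the tolerance `eps` is an assumed value, and the tetrahedra are assumed to be given already (the boundary computation and triangulation are not shown).

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]
Tetrahedron = Tuple[Point, Point, Point, Point]


def signed_volume(a: Point, b: Point, c: Point, d: Point) -> float:
    """Return six times the signed volume of tetrahedron (a, b, c, d)."""
    ax, ay, az = a[0] - d[0], a[1] - d[1], a[2] - d[2]
    bx, by, bz = b[0] - d[0], b[1] - d[1], b[2] - d[2]
    cx, cy, cz = c[0] - d[0], c[1] - d[1], c[2] - d[2]
    # 3x3 determinant of the edge vectors relative to d.
    return (ax * (by * cz - bz * cy)
            - ay * (bx * cz - bz * cx)
            + az * (bx * cy - by * cx))


def contains(tet: Tetrahedron, p: Point, eps: float = 1e-12) -> bool:
    """True if p lies inside or on the boundary of tet (eps is an assumed tolerance)."""
    a, b, c, d = tet
    total = signed_volume(a, b, c, d)
    if abs(total) <= eps:
        return False  # degenerate (flat) tetrahedron
    sign = 1.0 if total > 0 else -1.0
    # Replace each vertex by p in turn; p is inside exactly when every
    # sub-volume has the same orientation as the whole tetrahedron.
    parts = (
        signed_volume(p, b, c, d),
        signed_volume(a, p, c, d),
        signed_volume(a, b, p, d),
        signed_volume(a, b, c, p),
    )
    return all(sign * v >= -eps for v in parts)


def assign_points(tets: Sequence[Tetrahedron],
                  points: Sequence[Point]) -> Dict[int, List[Point]]:
    """Map each tetrahedron index to the points it contains (linear scan)."""
    result: Dict[int, List[Point]] = {i: [] for i in range(len(tets))}
    for p in points:
        for i, tet in enumerate(tets):
            if contains(tet, p):
                result[i].append(p)
                break  # a point on a shared face goes to the first match
    return result


unit_tet: Tetrahedron = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                         (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
grouped = assign_points([unit_tet], [(0.1, 0.1, 0.1), (1.0, 1.0, 1.0)])
```

The linear scan over tetrahedra keeps the sketch short; a real system would presumably use a spatial index to locate candidate tetrahedra first.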