A Model for k-Nearest Neighbor Query Processing Cost in Multidimensional Data Space

A cost model for predicting the performance of k-nearest neighbor queries in multidimensional data space is presented. Two concepts, the regional average volume and the density function, are introduced so that performance can be predicted for both uniform and non-uniform data distributions. Experiments show that the model's predictions are accurate to within an acceptable error range in low and medium dimensions.