An Estimating Model of Node Accesses for INN Search Algorithm in Multidimensional Spaces

Nearest Neighbor (NN) search has been widely used in spatial databases and multimedia databases. Incremental NN (INN) search is regarded as the optimal NN search because it incurs the minimum number of node accesses and can be used whether or not the number of objects to be retrieved is fixed in advance. The R*-tree is still regarded as being among the best high-dimensional indices. This paper presents an analytical model for estimating the performance of the INN search algorithm on the R*-tree. For the first time, our model takes m (the number of neighbor objects finally reported), n (database cardinality) and d (dimensionality) as parameters, focusing on the number of node accesses. In our model, (1) the two key factors d_m (the distance from the m-th NN object to the query point) and h (the side length of each node) are estimated using their upper bounds and lower bounds, which contributes substantially to the effectiveness of our model; (2) the particularity of the number of entries in the root node and the possible difference in fanout between the leaf nodes and the other nodes are taken into account. The theoretical analysis is verified by experiments.
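To make the quantity being modeled concrete, the following is a minimal Python sketch of a best-first incremental NN traversal that counts node accesses. It is an illustration under stated assumptions, not the implementation analyzed in this paper: the `Node` class is a hypothetical two-level bounding-box hierarchy standing in for an R*-tree, and the names `mindist`, `inn_search` and `stats` are introduced here only for the example. Nodes and objects share one priority queue keyed by their distance to the query point, so neighbors are reported one at a time and the caller may stop after any m.

```python
import heapq
import itertools
import math


class Node:
    """Hypothetical tree node: leaves hold points, inner nodes hold children."""

    def __init__(self, children=None, points=None):
        self.children = children or []
        self.points = points or []
        if self.points:
            # Bounding box of the stored points.
            lows = [min(c) for c in zip(*self.points)]
            highs = [max(c) for c in zip(*self.points)]
        else:
            # Bounding box enclosing all child boxes.
            lows = [min(c) for c in zip(*(ch.low for ch in self.children))]
            highs = [max(c) for c in zip(*(ch.high for ch in self.children))]
        self.low, self.high = tuple(lows), tuple(highs)


def mindist(q, low, high):
    """Smallest Euclidean distance from point q to the box [low, high]."""
    return math.sqrt(
        sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(q, low, high))
    )


def inn_search(root, q, stats):
    """Yield (distance, point) pairs in nondecreasing distance from q.

    stats["node_accesses"] is incremented each time a node is expanded.
    """
    tie = itertools.count()  # tie-breaker so heap entries never compare nodes
    heap = [(mindist(q, root.low, root.high), next(tie), root)]
    while heap:
        dist, _, item = heapq.heappop(heap)
        if isinstance(item, Node):
            stats["node_accesses"] += 1
            for p in item.points:
                heapq.heappush(heap, (math.dist(q, p), next(tie), p))
            for ch in item.children:
                heapq.heappush(heap, (mindist(q, ch.low, ch.high), next(tie), ch))
        else:
            # An object at the head of the queue is the next nearest neighbor.
            yield dist, item


# Toy data (assumed for illustration only).
leaf_a = Node(points=[(0, 0), (1, 1)])
leaf_b = Node(points=[(5, 5), (6, 6)])
leaf_c = Node(points=[(10, 10), (11, 9)])
root = Node(children=[leaf_a, leaf_b, leaf_c])
```

Calling `itertools.islice(inn_search(root, (0.5, 0.5), stats), m)` retrieves the first m neighbors; because the traversal is a generator, further neighbors can be requested later without restarting, and `stats["node_accesses"]` then reflects only the nodes whose minimum distance made them necessary to open.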