Paper Information - Exploiting Sample-Data Distributions to Reduce the Cost of Nearest-Neighbor Searches with Kd-Trees

We present KD-DT, an algorithm that uses a decision-tree-inspired measure to build a kd-tree for low-cost nearest-neighbor searches. The algorithm starts with a "standard" kd-tree and uses searches over a training set to evaluate and improve the tree's structure. In particular, it builds a tree that better ensures that a query and its nearest neighbors fall in the same subtree(s), thus reducing the cost of subsequent searches.
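To make the notion of search cost concrete, the following is a minimal sketch of the "standard" kd-tree that the abstract takes as its starting point, not of KD-DT itself. It assumes median splits on cyclically chosen axes and measures cost as the number of nodes visited during a nearest-neighbor query; the names `Node`, `build_kd_tree`, `nearest_neighbor`, and `average_cost` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point                    # data point stored at this node
    axis: int                       # splitting dimension at this node
    left: Optional["Node"] = None   # points at or below the split value
    right: Optional["Node"] = None  # points at or above the split value


def build_kd_tree(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Build a kd-tree with median splits, cycling through the axes (an assumption)."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(
        ordered[mid],
        axis,
        build_kd_tree(ordered[:mid], depth + 1),
        build_kd_tree(ordered[mid + 1:], depth + 1),
    )


def nearest_neighbor(root: Optional[Node], query: Point):
    """Return (nearest point, squared distance, number of nodes visited)."""
    best_point: Optional[Point] = None
    best_d2 = float("inf")
    visited = 0

    def visit(node: Optional[Node]) -> None:
        nonlocal best_point, best_d2, visited
        if node is None:
            return
        visited += 1
        d2 = sum((a - b) ** 2 for a, b in zip(node.point, query))
        if d2 < best_d2:
            best_point, best_d2 = node.point, d2
        diff = query[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        visit(near)
        # Cross the splitting plane only if a closer point could lie beyond it.
        if diff * diff < best_d2:
            visit(far)

    visit(root)
    return best_point, best_d2, visited


def average_cost(root: Optional[Node], queries: Sequence[Point]) -> float:
    """Mean number of nodes visited over a set of training queries."""
    return sum(nearest_neighbor(root, q)[2] for q in queries) / len(queries)
```

The search has to cross a splitting plane whenever the query and its nearest neighbor may lie on opposite sides of it, so a quantity like `average_cost` over a training set gives one rough way to compare candidate tree structures. How KD-DT actually scores and restructures the tree is not specified in this record.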

Douglas A. Talbert | Douglas H. Fisher

[1] D. Du, et al. Computing in Euclidean Geometry, 1995.

[2] Andrew W. Moore, et al. Multiresolution Instance-Based Learning, 1995, IJCAI.

[3] Douglas A. Talbert, et al. OPT-KD: An Algorithm for Optimizing Kd-Trees, 1999, ICML.

[4] Belur V. Dasarathy, et al. Nearest neighbor (NN) norms: NN pattern classification techniques, 1991.

[5] Andrew W. Moore, et al. Efficient memory-based learning for robot control, 1990.

[6] Peter N. Yianilos, et al. Data structures and algorithms for nearest neighbor search in general metric spaces, 1993, SODA '93.

[7] Jon Louis Bentley, et al. Multidimensional divide-and-conquer, 1980, CACM.

[8] Jon Louis Bentley, et al. An Algorithm for Finding Best Matches in Logarithmic Expected Time, 1977, TOMS.

[9] Franz Aurenhammer, et al. Voronoi diagrams—a survey of a fundamental geometric data structure, 1991, CSUR.

[10] Leon Sterling, et al. A CBR/RBR Hybrid for Designing Nutritional Menus, 1998.