Fast and versatile algorithm for nearest neighbor search based on a lower bound tree

In this paper, we present a fast and versatile algorithm that can perform a variety of nearest neighbor searches. It gains efficiency by using a lower bound on the distance: when the lower bound already exceeds the current global minimum distance, the distance itself need not be calculated. At the preprocessing stage, the proposed algorithm constructs a lower bound tree (LB-tree) by agglomeratively clustering all the sample points to be searched. Given a query point, the lower bound of its distance to each sample point can be calculated by using the internal nodes of the LB-tree. To reduce the number of lower bounds actually calculated, the winner-update search strategy is used for traversing the tree. For further efficiency improvement, a data transformation can be applied to the sample and query points. In addition to finding the nearest neighbor, the proposed algorithm can also (i) provide the k-nearest neighbors progressively; (ii) find the nearest neighbors within a specified distance threshold; and (iii) identify neighbors whose distances to the query are sufficiently close to the minimum distance of the nearest neighbor. Our experiments show that the proposed algorithm can save substantial computation, particularly when the distance from the query point to its nearest neighbor is small relative to its distance to most other samples (which is the case for many object recognition problems).
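
The idea of pruning with a distance lower bound can be illustrated with a minimal Python sketch. It is not the paper's LB-tree: as a simplifying assumption, each tree node here stores a cluster center and a covering radius, so the triangle inequality gives the bound max(d(q, center) - radius, 0). The traversal always expands the node with the smallest current bound, which is written in the spirit of the winner-update strategy rather than as a reproduction of it. All names (Node, build_tree, lower_bound, neighbors) are hypothetical, and the naive agglomerative build is only meant for small inputs.

```python
import heapq
import math
from itertools import count, islice


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class Node:
    """Cluster summary (assumed layout): center, covering radius, children."""

    def __init__(self, center, radius, members, children=()):
        self.center = center
        self.radius = radius
        self.members = members      # sample points covered by this node
        self.children = children    # empty tuple for a leaf


def build_tree(points):
    """Naive agglomerative build: repeatedly merge the two clusters whose
    centers are closest. Assumes a non-empty list of points."""
    nodes = [Node(p, 0.0, [p]) for p in points]
    while len(nodes) > 1:
        pairs = ((i, j) for i in range(len(nodes))
                 for j in range(i + 1, len(nodes)))
        i, j = min(pairs, key=lambda ij: dist(nodes[ij[0]].center,
                                              nodes[ij[1]].center))
        a, b = nodes[i], nodes[j]
        members = a.members + b.members
        center = tuple(sum(c) / len(members) for c in zip(*members))
        radius = max(dist(center, m) for m in members)
        rest = [n for k, n in enumerate(nodes) if k not in (i, j)]
        nodes = rest + [Node(center, radius, members, (a, b))]
    return nodes[0]


def lower_bound(node, query):
    """By the triangle inequality, no member of the node is closer to the
    query than this value. For a leaf (radius 0) it is the exact distance."""
    return max(dist(query, node.center) - node.radius, 0.0)


def neighbors(root, query):
    """Yield (distance, point) pairs in non-decreasing distance order."""
    tie = count()  # tie-breaker so the heap never compares Node objects
    heap = [(lower_bound(root, query), next(tie), root)]
    while heap:
        bound, _, node = heapq.heappop(heap)
        if not node.children:
            # The smallest bound belongs to a leaf, so it is an exact
            # distance and nothing still in the heap can be closer.
            yield bound, node.members[0]
        else:
            for child in node.children:
                heapq.heappush(heap, (lower_bound(child, query),
                                      next(tie), child))


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0), (9.0, 9.0)]
root = build_tree(points)
two_nearest = list(islice(neighbors(root, (0.9, 0.1)), 2))
```

Because neighbors is a generator, the caller decides when to stop: taking the first item gives the nearest neighbor, taking k items gives the k nearest progressively, and stopping once the yielded distance exceeds a threshold (absolute, or relative to the first yielded distance) covers the other two query types listed in the abstract.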
