Fast multidimensional nearest neighbor search algorithm using priority queue

Nearest neighbor search in high-dimensional spaces is an important problem for a wide variety of applications, including multimedia information retrieval, data mining, and pattern recognition. In these applications, the curse of dimensionality is a major obstacle to developing efficient search methods. This paper addresses the design of an efficient algorithm for high-dimensional nearest neighbor search using a priority queue. The proposed algorithm is based on simple linear search and eliminates unnecessary arithmetic operations from the distance computations between multidimensional vectors. We also propose two techniques, a dimensional sorting method and a PCA-based method, to further accelerate the search. Experimental results indicate that the scheme scales well even for a very large number of dimensions. © 2008 Wiley Periodicals, Inc. Electr Eng Jpn, 164(3): 69–77, 2008; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/eej.20502
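The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common way to combine linear search, a priority queue, and the elimination of unnecessary arithmetic: a k-nearest-neighbor scan that abandons a squared-distance accumulation as soon as it can no longer beat the current k-th best candidate. The function name `knn_partial_distance`, the use of squared Euclidean distance, and the bounded max-heap are assumptions made for illustration, not details taken from the paper.

```python
import heapq
from typing import List, Sequence, Tuple


def knn_partial_distance(
    data: Sequence[Sequence[float]],
    query: Sequence[float],
    k: int,
) -> List[Tuple[float, int]]:
    """Return up to k (squared_distance, index) pairs, closest first.

    Illustrative sketch only: linear scan with early termination of each
    distance computation, using a bounded max-heap as the priority queue.
    """
    if k <= 0:
        return []

    # heapq is a min-heap, so distances are stored negated; heap[0] then
    # holds the worst (largest) distance among the current candidates.
    heap: List[Tuple[float, int]] = []

    for idx, vec in enumerate(data):
        # Pruning bound: the k-th best squared distance found so far.
        bound = -heap[0][0] if len(heap) == k else float("inf")

        acc = 0.0
        pruned = False
        for x, q in zip(vec, query):
            diff = x - q
            acc += diff * diff
            if acc >= bound:
                # The remaining dimensions can only increase acc, so the
                # rest of the arithmetic for this vector is unnecessary.
                pruned = True
                break
        if pruned:
            continue

        if len(heap) < k:
            heapq.heappush(heap, (-acc, idx))
        else:
            heapq.heapreplace(heap, (-acc, idx))

    return sorted((-neg, idx) for neg, idx in heap)


if __name__ == "__main__":
    points = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.5, 0.5]]
    print(knn_partial_distance(points, [0.0, 0.0], k=2))
```

In a scan like this, the early exit fires sooner when the dimensions that contribute most to the distance are visited first. That is one plausible motivation for reordering or rotating the coordinates before the search, although the exact form of the dimensional sorting and PCA-based methods is not given in the abstract.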
