Exploiting distance coherence to speed up range queries in metric indexes

Similarity searching has applications in many fields, such as audio and image databases, image quantization and compression, text or document databases, computational biology, and data mining. Common to all these applications is a universe U of objects and a non-negative distance function d : U × U → R. The distance function is a metric if it satisfies, for all x, y, z ∈ U, the usual metric axioms: d(x, y) = 0 if and only if x = y; d(x, y) = d(y, x); and d(x, y) ≤ d(x, z) + d(z, y).
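
The metric conditions (identity, symmetry, and the triangle inequality) can be sanity-checked on a finite sample of objects. The following is a minimal sketch, not taken from the paper: the function name `violates_metric_axioms`, the tolerance `tol`, and the sample points are all hypothetical and chosen only for illustration. Passing the check on a sample does not prove that a distance function is a metric; it can only expose a violation.

```python
import math
from itertools import product


def violates_metric_axioms(sample, d, tol=1e-12):
    """Return the name of the first axiom violated on `sample`, or None.

    `sample` is a finite list of objects and `d` a candidate distance
    function. Both names are illustrative assumptions.
    """
    for x, y in product(sample, repeat=2):
        dxy = d(x, y)
        if dxy < 0:
            return "non-negativity"
        # Identity: d(x, y) = 0 exactly when x = y.
        if (dxy <= tol) != (x == y):
            return "identity"
        if abs(dxy - d(y, x)) > tol:
            return "symmetry"
    for x, y, z in product(sample, repeat=3):
        # Triangle inequality: d(x, y) <= d(x, z) + d(z, y).
        if d(x, y) > d(x, z) + d(z, y) + tol:
            return "triangle inequality"
    return None


def euclidean(p, q):
    return math.dist(p, q)


def squared_euclidean(p, q):
    # Not a metric: squaring breaks the triangle inequality.
    return math.dist(p, q) ** 2


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 3.0)]
print(violates_metric_axioms(points, euclidean))
print(violates_metric_axioms(points, squared_euclidean))
```

On this sample the Euclidean distance passes all checks, while the squared Euclidean distance fails the triangle inequality (the distance from (0, 0) to (2, 0) is 4, but the path through (1, 0) costs only 2). The triangle inequality is the property that metric indexes rely on to discard objects without computing their distance to the query.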
