Faster Proximity Searching with the Distal SAT

In this paper we present the Distal Spatial Approximation Tree (DiSAT), an algorithmic improvement of the Spatial Approximation Tree (SAT). Our improvement increases the discarding power of the SAT by selecting distal nodes instead of the proximal nodes proposed in the original paper. Our approach is parameter-free, and it was the most competitive index in an extensive benchmark: from two to forty times faster than the SAT, and faster than the List of Clusters (LC), which is considered the state of the art for main-memory, linear-size indexes when cost is measured in distance computations.
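The abstract does not spell out the construction, so the following is only a minimal sketch of the idea of distal versus proximal selection. It assumes the usual SAT neighbor rule (a candidate becomes a neighbor of the root if it is closer to the root than to every neighbor chosen so far) and assumes that "distal" means visiting candidates from farthest to nearest instead of nearest to farthest. The names `select_neighbors`, `dist`, and `distal` are hypothetical and do not come from the paper.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def select_neighbors(
    root: T,
    others: Sequence[T],
    dist: Callable[[T, T], float],
    distal: bool = True,
) -> Tuple[List[T], List[List[T]]]:
    """Pick the root's neighbors and assign every other element to one of them.

    Assumption: distal=True visits candidates farthest-first (DiSAT-style),
    distal=False visits them nearest-first (original SAT-style).
    Returns (neighbors, buckets), where buckets[i] belongs to neighbors[i].
    """
    ordered = sorted(others, key=lambda x: dist(root, x), reverse=distal)

    neighbors: List[T] = []
    rest: List[T] = []
    for x in ordered:
        # x is a neighbor only if it is closer to the root than to
        # every neighbor selected so far.
        if all(dist(x, root) < dist(x, n) for n in neighbors):
            neighbors.append(x)
        else:
            rest.append(x)

    # Every remaining element goes to the bucket of its closest neighbor;
    # a full tree would recurse on each bucket with that neighbor as root.
    buckets: List[List[T]] = [[] for _ in neighbors]
    for x in rest:
        i = min(range(len(neighbors)), key=lambda j: dist(x, neighbors[j]))
        buckets[i].append(x)
    return neighbors, buckets


def abs_dist(a: float, b: float) -> float:
    """Toy metric on the real line, used only for illustration."""
    return abs(a - b)


proximal = select_neighbors(0, [1, 3, 4, 10], abs_dist, distal=False)
distal = select_neighbors(0, [1, 3, 4, 10], abs_dist, distal=True)
```

On this toy input the two visiting orders produce different neighbor sets for the same root, which is the only point the sketch is meant to show; it says nothing about the search speedups reported in the benchmark.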
