A queries-based structure for similarity searching in static and dynamic metric spaces

Abstract This paper develops a metric indexing method that uses users' queries to reduce the search cost of similarity search systems and to avoid the insertion cost in dynamic data sets. We propose an indexing method that improves its own structure based on users' queries. The proposed method, called I-Clusters, is a metric clustering method extended from the List of Clusters method. It lowers the construction cost, and its search cost decreases as queries are executed. I-Clusters thereby addresses the trade-off between construction cost and search cost, and it can index dynamic data sets without an additional cost for inserting objects. Experimental results show that I-Clusters significantly reduces the search cost as queries are executed, and that its search performance can reach that of List of Clusters.
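
For readers unfamiliar with the baseline, the following is a minimal sketch of a standard List of Clusters index with fixed-size buckets, the structure that I-Clusters extends. It is not the I-Clusters algorithm itself, whose query-driven refinement is not specified in this abstract. The names `build_list_of_clusters` and `range_search`, the Euclidean distance, and the choice of the first remaining object as each center are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
# A cluster is (center, covering radius, bucket of objects).
Cluster = Tuple[Point, float, List[Point]]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance; any metric could be substituted (assumption)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def build_list_of_clusters(
    objects: Sequence[Point],
    bucket_size: int,
    dist: Callable[[Point, Point], float] = euclidean,
) -> List[Cluster]:
    """Build a List of Clusters with fixed-size buckets."""
    remaining = list(objects)
    clusters: List[Cluster] = []
    while remaining:
        # Assumption: take the first remaining object as the center.
        center = remaining.pop(0)
        # The bucket holds the bucket_size objects closest to the center.
        remaining.sort(key=lambda o: dist(center, o))
        bucket = remaining[:bucket_size]
        remaining = remaining[bucket_size:]
        # Covering radius: distance to the farthest object in the bucket.
        radius = dist(center, bucket[-1]) if bucket else 0.0
        clusters.append((center, radius, bucket))
    return clusters


def range_search(
    clusters: List[Cluster],
    query: Point,
    r: float,
    dist: Callable[[Point, Point], float] = euclidean,
) -> List[Point]:
    """Return every indexed object within distance r of the query."""
    result: List[Point] = []
    for center, radius, bucket in clusters:
        d = dist(query, center)
        if d <= r:
            result.append(center)
        # Scan the bucket only if the query ball intersects the cluster ball.
        if d - r <= radius:
            result.extend(o for o in bucket if dist(query, o) <= r)
        # If the query ball lies strictly inside this cluster ball,
        # no later cluster can contain a match, so the search stops.
        if d + r < radius:
            break
    return result
```

In this baseline, construction sorts the remaining objects once per cluster, which is the kind of up-front cost the abstract says I-Clusters reduces by deferring structural improvement to query time.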
