Where are you heading, metric access methods?: a provocative survey

This paper discusses the impact of the metric indexing paradigm on real-world applications. We ask whether the priorities established over the past decades in research on metric access methods (MAMs) reflect the actual needs of practitioners. In particular, we formulate the following pragmatic questions: Are the established MAM cost measures relevant? Isn't the metric space model too general, given that the majority of real-world applications use Lp spaces? Conversely, isn't the metric model too restrictive for the growing community of practitioners who use non-metric distances? Are simple similarity queries competitive enough? Have real-world similarity search engines ever used a general metric access method, or do they rely on specialized indexing? Is there real demand for content-based similarity search, or will annotations and keyword search win the game? We justify these questions by examining the relevant literature and existing search engines. Finally, we try to turn the questions into answers and into suggestions for future research on MAMs.
