Quality-aware and load sensitive planning of image similarity queries

Evaluating similarity queries over image collections effectively and efficiently is an important but difficult problem. In many settings, a system does not handle individual queries in isolation but rather a stream of queries. Researchers have proposed a number of query-evaluation alternatives and generalizations, in particular parallel methods that span several components and methods that yield approximate results. Choosing a plan for a given query is therefore subject to more criteria than in conventional settings, notably result quality in addition to response time and resource consumption. We have designed and implemented a query planner that incorporates these concepts. We describe our space of possible plans and how we search this space. The usefulness of such a planner depends on a number of criteria, e.g., the increase in throughput, adaptivity to different workloads, query-planning overhead, and the quantitative influence of the scoring function. This article describes evaluations of each of these criteria and shows that the benefit of our particular approach is significant.
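To make the idea of quality-aware, load-sensitive plan choice concrete, the following is a minimal illustrative sketch and not the planner described in the article. The plan names, the estimates attached to them, the weights, and the linear form of the scoring function are all assumptions made for this example; the abstract only states that a scoring function weighs result quality against response time and resource consumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A hypothetical candidate plan with made-up estimates."""
    name: str
    quality: float        # expected result quality in [0, 1]
    response_time: float  # estimated response time in seconds
    cost: float           # estimated resource consumption (arbitrary units)


def score(plan, load, w_quality=1.0, w_time=0.5, w_cost=0.5):
    """Assumed linear scoring function; higher is better.

    Resource consumption is penalized in proportion to the current
    system load, so expensive plans become less attractive when the
    query stream is heavy.
    """
    return (w_quality * plan.quality
            - w_time * plan.response_time
            - w_cost * load * plan.cost)


def choose_plan(plans, load):
    """Exhaustively search a small plan space for the best-scoring plan."""
    return max(plans, key=lambda p: score(p, load))


# Hypothetical plan space: exact vs. approximate, sequential vs. parallel.
PLANS = [
    Plan("exact-sequential", quality=1.0, response_time=2.0, cost=2.0),
    Plan("exact-parallel-4", quality=1.0, response_time=0.6, cost=2.4),
    Plan("approximate", quality=0.8, response_time=0.3, cost=0.3),
]

if __name__ == "__main__":
    for load in (0.0, 1.0):
        print(load, choose_plan(PLANS, load).name)
```

With these assumed numbers, an idle system (load 0.0) picks the exact parallel plan, while a fully loaded one (load 1.0) falls back to the cheaper approximate plan, trading some result quality for throughput.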
