An Efficient Incremental Lower Bound Approach for Solving Approximate Nearest-Neighbor Problem of Complex Vague Queries

In this paper, we define a complex vague query as a multi-feature nearest neighbor query. To answer such a query, the system must search several feature spaces individually and then combine the results to find the final answers. The feature spaces are usually multidimensional and may contain very large volumes of data, so the search costs for complex vague queries are prohibitively expensive. For a single feature space, the problems of answering nearest neighbor and approximate nearest neighbor queries have been studied extensively and are quite well addressed in the literature. This paper, however, introduces an approach for finding (1+ε)-approximate nearest neighbors of complex vague queries, which must deal with the problem over multiple feature spaces. The approach is based on a novel, efficient, and general algorithm called ISA (Incremental hyper-Sphere Approach) [12, 13], which was recently introduced for solving the nearest neighbor problem in the Vague Query System (VQS) [22]. To the best of our knowledge, the work presented in this paper is one of the first general solutions to the problem of approximate multi-feature nearest neighbor queries. Experimental results demonstrate the efficiency of the proposed approach.
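To make the problem statement concrete, the following is a minimal brute-force sketch of a multi-feature nearest neighbor query and of the (1+ε) acceptance test. It is not the ISA algorithm. The weighted-sum combining function, the Euclidean per-feature distance, the feature names, and all function names are assumptions chosen for illustration only.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def combined_distance(obj, query, weights):
    """Combine per-feature distances with a weighted sum (assumed combining function)."""
    return sum(w * euclidean(obj[f], query[f]) for f, w in weights.items())

def exact_nearest(objects, query, weights):
    """Brute-force exact nearest neighbor over all feature spaces."""
    return min(objects, key=lambda oid: combined_distance(objects[oid], query, weights))

def is_approx_nn(candidate_id, objects, query, weights, eps):
    """True if the candidate lies within (1 + eps) times the exact nearest distance."""
    best_id = exact_nearest(objects, query, weights)
    best = combined_distance(objects[best_id], query, weights)
    cand = combined_distance(objects[candidate_id], query, weights)
    return cand <= (1 + eps) * best

# Hypothetical data: three objects described in two feature spaces.
objects = {
    "a": {"color": (0.0, 0.0), "shape": (1.0,)},
    "b": {"color": (3.0, 4.0), "shape": (0.0,)},
    "c": {"color": (0.0, 1.0), "shape": (1.0,)},
}
query = {"color": (0.0, 0.0), "shape": (0.0,)}
weights = {"color": 0.5, "shape": 0.5}

print(exact_nearest(objects, query, weights))            # a
print(is_approx_nn("c", objects, query, weights, 1.5))   # True
print(is_approx_nn("c", objects, query, weights, 0.5))   # False
```

The brute-force scan visits every object in every feature space, which is exactly the cost that an incremental, index-based approach aims to avoid; the sketch only fixes what a correct (1+ε)-approximate answer looks like.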

[1] Christian Böhm, et al. Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases, 2001, CSUR.

[2] Surya Nepal, et al. Query processing issues in image (multimedia) databases, 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[3] Moni Naor, et al. Optimal aggregation algorithms for middleware, 2001, PODS.

[4] Tran Khanh Dang, et al. ISA - An Incremental Hyper-sphere Approach for Efficiently Solving Complex Vague Queries, 2002, DEXA.

[5] Ronald Fagin, et al. Combining Fuzzy Information from Multiple Systems, 1999, J. Comput. Syst. Sci.

[6] Hans-Jörg Schek, et al. Fast Evaluation Techniques for Complex Similarity Queries, 2001, VLDB.

[7] Marco Patella, et al. PAC nearest neighbor queries: Approximate and controlled search in high-dimensional and metric spaces, 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[8] Hans-Peter Kriegel, et al. S3: similarity search in CAD database systems, 1997, SIGMOD '97.

[9] Masahito Hirakawa, et al. ARES: A relational database with the capability of performing flexible interpretation of queries, 1986, IEEE Transactions on Software Engineering.

[10] Thomas S. Huang, et al. Supporting similarity queries in MARS, 1997, MULTIMEDIA '97.

[11] Christos Faloutsos, et al. Fast subsequence matching in time-series databases, 1994, SIGMOD '94.

[12] A Min Tjoa, et al. Advanced Query Mechanisms in Tourism Information Systems, 2002, ENTER.

[13] Jonathan Goldstein, et al. When Is "Nearest Neighbor" Meaningful?, 1999, ICDT.

[14] Josef Küng, et al. VQS - a vague query system prototype, 1997, Database and Expert Systems Applications. 8th International Conference, DEXA '97. Proceedings.

[15] Laura M. Haas, et al. Using Fagin's algorithm for merging ranked results in multimedia middleware, 1999, Proceedings Fourth IFCIS International Conference on Cooperative Information Systems. CoopIS 99 (Cat. No.PR00384).

[16] Rafail Ostrovsky, et al. Efficient search for approximate nearest neighbor in high dimensional spaces, 1998, STOC '98.

[17] Sakti Pramanik, et al. An efficient searching algorithm for approximate nearest neighbor queries in high dimensions, 1999, Proceedings IEEE International Conference on Multimedia Computing and Systems.

[18] Hans-Jörg Schek, et al. A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces, 1998, VLDB.

[19] Hanan Samet, et al. Ranking in Spatial Databases, 1995, SSD.

[20] Josef Küng, et al. An Incremental Hypercube Approach for Finding Best Matches for Vague Queries, 1999, DEXA.

[21] Sunil Arya, et al. An optimal algorithm for approximate nearest neighbor searching fixed dimensions, 1998, JACM.

[22] Tran Khanh Dang, et al. A General and Efficient Approach for Solving Nearest Neighbor Problem in the Vague Query System, 2002, WAIM.

[23] Amihai Motro, et al. VAGUE: a user interface to relational databases that permits vague queries, 1988, TOIS.

[24] Jon M. Kleinberg, et al. Two algorithms for nearest-neighbor search in high dimensions, 1997, STOC '97.

[25] Tran Khanh Dang, et al. The SH-tree: A Super Hybrid Index Structure for Multidimensional Data, 2001, DEXA.

[26] Josef Küng, et al. Vague joins - an extension of the vague query system VQS, 1998, Proceedings Ninth International Workshop on Database and Expert Systems Applications (Cat. No.98EX130).

[27] Oliver Günther, et al. Multidimensional access methods, 1998, CSUR.

[28] Ronald Fagin, et al. Combining fuzzy information from multiple systems (extended abstract), 1996, PODS.

[29] Luis Gravano, et al. Evaluating Top-k Selection Queries, 1999, VLDB.

[30] John R. Smith, et al. Supporting Incremental Join Queries on Ranked Inputs, 2001, VLDB.

[31] Hans-Peter Kriegel, et al. Efficient User-Adaptable Similarity Search in Large Multimedia Databases, 1997, VLDB.

[32] Hans-Peter Kriegel, et al. Efficiently supporting multiple similarity queries for mining in metric databases, 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).