Seamlessly integrating similarity queries in SQL

Modern database applications increasingly rely on database management systems (DBMS) to store multimedia and other complex data. To adequately support retrieval of these kinds of data, a DBMS needs to answer similarity queries. However, standard Structured Query Language (SQL) does not provide effective support for such queries. This paper proposes an extension to SQL that seamlessly integrates syntactic constructions for expressing similarity predicates into the existing SQL syntax, and describes the implementation of a similarity retrieval engine that allows similarity queries to be posed through the language extension in a relational DBMS. The engine allows every aspect of the proposed extension to be evaluated, including the data definition language and data manipulation language statements, and employs metric access methods to accelerate the queries. Copyright © 2008 John Wiley & Sons, Ltd.
