Evaluating refined queries in top-k retrieval systems

In many applications, users specify target values for certain attributes/features without requiring exact matches to these values in return. Instead, the result is typically a ranked list of "top k" objects that best match the specified feature values. User subjectivity is an important aspect of such queries, i.e., which objects are relevant to the user and which are not depends on the perception of the user. Due to the subjective nature of top-k queries, the answers returned by the system to an user query often do not satisfy the users need right away, either because the weights and the distance functions associated with the features do not accurately capture the users perception or because the specified target values do not fully capture her information need or both. In such cases, the user would like to refine the query and resubmit it in order to get back a better set of answers. While there has been a lot of research on query refinement models, there is no work that we are aware of on supporting refinement of top-k queries efficiently in a database system. Done naively, each "refined" query can be treated as a "starting" query and evaluated from scratch. We explore alternative approaches that significantly improve the cost of evaluating refined queries by exploiting the observation that the refined queries are not modified drastically from one iteration to another. Our experiments over a real-life multimedia data set show that the proposed techniques save more than 80 percent of the execution cost of refined queries over the naive approach and is more than an order of magnitude faster than a simple sequential scan.

[1]  K. Chakrabarti Query Reenement for Content Based Multimedia Retrieval in Mars , 1999 .

[2]  Hanan Samet,et al.  Ranking in Spatial Databases , 1995, SSD.

[3]  Sharad Mehrotra,et al.  The hybrid tree: an index structure for high dimensional feature spaces , 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[4]  Amihai Motro,et al.  VAGUE: a user interface to relational databases that permits vague queries , 1988, TOIS.

[5]  Nick Roussopoulos,et al.  Nearest neighbor queries , 1995, SIGMOD '95.

[6]  Christos Faloutsos,et al.  Fast Nearest Neighbor Search in Medical Image Databases , 1996, VLDB.

[7]  Michael J. Carey,et al.  On saying “Enough already!” in SQL , 1997, SIGMOD '97.

[8]  Raj Jain,et al.  Algorithms and strategies for similarity retrieval , 1996 .

[9]  Christos Faloutsos,et al.  QBIC project: querying images by content, using color, texture, and shape , 1993, Electronic Imaging.

[10]  Chad Carson,et al.  Optimizing queries over multimedia repositories , 1996, SIGMOD '96.

[11]  Thomas S. Huang,et al.  Supporting Ranked Boolean Similarity Queries in MARS , 1998, IEEE Trans. Knowl. Data Eng..

[12]  Surya Nepal,et al.  Query processing issues in image (multimedia) databases , 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[13]  Sharad Mehrotra,et al.  Similarity Search Using Multiple Examples in MARS , 1999, VISUAL.

[14]  Thomas S. Huang,et al.  Relevance feedback techniques in interactive content-based image retrieval , 1997, Electronic Imaging.

[15]  Christos Faloutsos,et al.  FALCON: Feedback Adaptive Loop for Content-Based Retrieval , 2000, VLDB.

[16]  Ronald Fagin,et al.  Combining Fuzzy Information from Multiple Systems , 1999, J. Comput. Syst. Sci..

[17]  Kriengkrai Porkaew,et al.  Query refinement for multimedia similarity retrieval in MARS , 1999, MULTIMEDIA '99.

[18]  Hans-Peter Kriegel,et al.  Efficient User-Adaptable Similarity Search in Large Multimedia Databases , 1997, VLDB.

[19]  Thomas S. Huang,et al.  Supporting similarity queries in MARS , 1997, MULTIMEDIA '97.

[20]  Prashant J. Shenoy,et al.  Rules of thumb in data engineering , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[21]  Ronald Fagin,et al.  Fuzzy queries in multimedia database systems , 1998, PODS '98.

[22]  Andreas Reuter,et al.  Transaction Processing: Concepts and Techniques , 1992 .

[23]  Luis Gravano,et al.  Optimizing queries over multimedia repositories , 1996, SIGMOD 1996.

[24]  Sharad Mehrotra,et al.  Efficient Query Refinement in Multimedia Databases , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[25]  Ronald Fagin,et al.  Combining fuzzy information from multiple systems (extended abstract) , 1996, PODS.

[26]  B. Reljin,et al.  Adaptive Content-Based Image Retrieval with Relevance Feedback , 2005, EUROCON 2005 - The International Conference on "Computer as a Tool".

[27]  Christos Faloutsos,et al.  MindReader: Querying Databases Through Multiple Examples , 1998, VLDB.

[28]  Luis Gravano,et al.  Evaluating Top-k Selection Queries , 1999, VLDB.

[29]  Sharad Mehrotra,et al.  Query reformulation for content based multimedia retrieval in MARS , 1999, Proceedings IEEE International Conference on Multimedia Computing and Systems.

[30]  Thomas S. Huang,et al.  Relevance feedback: a power tool for interactive content-based image retrieval , 1998, IEEE Trans. Circuits Syst. Video Technol..

[31]  Thomas S. Huang,et al.  Content-based image retrieval with relevance feedback in MARS , 1997, Proceedings of International Conference on Image Processing.

[32]  Hans-Peter Kriegel,et al.  Optimal multi-step k-nearest neighbor search , 1998, SIGMOD '98.