Query Refinement in Similarity Retrieval Systems

In many applications, users specify target values for certain attributes or features without requiring exact matches to these values in return. Instead, the result is typically a ranked list of the top k objects that best match the specified feature values. User subjectivity is an important aspect of such queries: which objects are relevant and which are not depends on the perception of the user. Because similarity-based retrieval is subjective, the answers returned by the system often do not satisfy the user’s information need right away, either because the weights and distance functions associated with the features do not accurately capture the user’s perception, or because the specified target values do not fully capture the information need, or both. The most commonly used technique to overcome this problem is query refinement. In this technique, the user provides the system with feedback on the “relevance” of the answers to the query. The system then analyzes the feedback, refines the query (i.e., modifies the weights, distance functions, target values, etc.), evaluates it, and returns the new results. In this paper, we provide an overview of the techniques used to construct the refined query from the user’s feedback, as well as the techniques used to evaluate the refined query efficiently. We present experimental results demonstrating the effectiveness of the techniques discussed in the paper.
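
The refinement loop described above can be illustrated with a minimal sketch. It is not the method of the paper: it assumes numeric feature vectors, a weighted Euclidean distance, a target that moves toward the centroid of the objects the user marked relevant, and weights set inversely proportional to the spread of the relevant objects along each feature. The names `weighted_distance`, `top_k`, `refine`, and the parameter `alpha` are hypothetical and chosen only for illustration.

```python
import math


def weighted_distance(obj, target, weights):
    """Weighted Euclidean distance between an object and the target values."""
    return math.sqrt(sum(w * (o - t) ** 2 for o, t, w in zip(obj, target, weights)))


def top_k(objects, target, weights, k):
    """Return the k objects closest to the target under the weighted distance."""
    return sorted(objects, key=lambda o: weighted_distance(o, target, weights))[:k]


def refine(target, weights, relevant, alpha=0.5):
    """Build a refined query from the objects the user marked relevant.

    Assumption: the new target is a blend of the old target and the centroid
    of the relevant objects (alpha controls how far it moves), and each new
    weight is proportional to 1 / standard deviation of the relevant objects
    along that feature, normalized so the weights sum to 1.
    """
    if not relevant:
        # No feedback: leave the query unchanged.
        return list(target), list(weights)

    n = len(relevant)
    dims = len(target)
    centroid = [sum(r[d] for r in relevant) / n for d in range(dims)]
    new_target = [(1 - alpha) * t + alpha * c for t, c in zip(target, centroid)]

    eps = 1e-6  # avoids division by zero when all relevant objects agree
    raw = []
    for d in range(dims):
        variance = sum((r[d] - centroid[d]) ** 2 for r in relevant) / n
        raw.append(1.0 / (math.sqrt(variance) + eps))
    total = sum(raw)
    new_weights = [w / total for w in raw]
    return new_target, new_weights
```

One round of feedback would then call `top_k` with the initial target and weights, pass the objects the user marked relevant to `refine`, and call `top_k` again with the refined target and weights. A feature on which the relevant objects agree closely receives a larger weight than one on which they vary widely.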
