Flexible integration of multimedia sub-queries with qualitative preferences

Complex multimedia queries, which aim to retrieve from large databases the objects that best match the query specification, are usually processed by splitting them into a set of m simpler sub-queries, each dealing with only some of the query features. To determine the overall best-matching objects, a rule is then needed to integrate the results of these sub-queries, i.e., to globally rank the m-dimensional vectors of matching degrees, or partial scores, that objects obtain on the m sub-queries. State-of-the-art approaches all adopt a scoring function, such as weighted average, as the integration rule: it aggregates the m partial scores into an overall (numerical) similarity score, so that objects can be linearly ordered and only the highest-scored ones returned to the user. This choice, however, forces the system to compromise between the different sub-queries and can easily cause relevant results to be missed. In this paper we explore the potential of a more general approach, based on qualitative preferences, which can define arbitrary partial (rather than only linear) orders on database objects and thus gives greater flexibility in shaping what the user is looking for. For efficient evaluation, we propose two integration algorithms that work with any (monotone) partial order (and thus also with scoring functions): MPO, which delivers objects one layer of the partial order at a time, and iMPO, which can incrementally return one object at a time and is therefore also suitable for processing top-k queries. Our analysis demonstrates that using qualitative preferences pays off. In particular, using Skyline and Region-prioritized Skyline preferences for queries on a real image database, we show that the results have a precision comparable to that obtainable using scoring functions, yet they are obtained much faster, saving up to about 70% of database accesses.
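
The contrast between a scoring function and a qualitative (Skyline) preference can be illustrated with a small sketch. The code below is not the MPO or iMPO algorithm from the paper: it is a naive in-memory illustration, and the object names, partial scores, weights, and function names are all hypothetical. It assumes higher partial scores are better and uses plain Pareto dominance as the partial order.

```python
from typing import Dict, List, Tuple

Scores = Tuple[float, ...]  # one partial score per sub-query (higher is better)


def dominates(a: Scores, b: Scores) -> bool:
    """a dominates b if it is no worse on every sub-query and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(objects: Dict[str, Scores]) -> List[str]:
    """Objects not dominated by any other object (first layer of the partial order)."""
    return sorted(
        name
        for name, s in objects.items()
        if not any(dominates(t, s) for t in objects.values())
    )


def layers(objects: Dict[str, Scores]) -> List[List[str]]:
    """Naively peel off successive skylines, one layer of the partial order at a time."""
    remaining = dict(objects)
    result = []
    while remaining:
        layer = skyline(remaining)
        result.append(layer)
        for name in layer:
            del remaining[name]
    return result


def top_k_weighted(objects: Dict[str, Scores], weights: Scores, k: int) -> List[str]:
    """Linear order induced by a weighted-average scoring function."""
    def score(s: Scores) -> float:
        return sum(w * x for w, x in zip(weights, s))
    return sorted(objects, key=lambda n: score(objects[n]), reverse=True)[:k]


# Hypothetical partial scores on m = 2 sub-queries.
objects = {
    "a": (0.9, 0.2),
    "b": (0.5, 0.5),
    "c": (0.2, 0.9),
    "d": (0.4, 0.4),
}

print(top_k_weighted(objects, (0.8, 0.2), 2))  # weighted average, top 2
print(skyline(objects))                        # undominated objects
print(layers(objects))                         # layer-by-layer delivery
```

With these made-up scores, the weighted average with weights (0.8, 0.2) returns "a" and "b" as its top 2 and leaves out "c", even though "c" is the best match on the second sub-query. The Skyline keeps "a", "b", and "c", and only "d", which "b" dominates, falls to the second layer.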
