Combining Fuzzy Information from Multiple Systems

In a traditional database system, the result of a query is a set of values (those values that satisfy the query). In other data servers, such as systems that support queries based on image content, or many text retrieval systems, the result of a query is a sorted list. For example, in a system with queries based on image content, a query might ask for objects that are a particular shade of red, and the result would be a list of the objects in the database, sorted by how well the color of each object matches the color given in the query. A multimedia system must somehow synthesize both types of queries (those whose result is a set and those whose result is a sorted list) in a consistent manner. In this paper we discuss the solution adopted by Garlic, a multimedia information system being developed at the IBM Almaden Research Center. This solution is based on “graded” (or “fuzzy”) sets. Issues of efficient query evaluation in a multimedia system are very different from those in a traditional database system, because the multimedia system receives answers to subqueries from various subsystems, which can be accessed only in limited ways. For the important class of queries that are conjunctions of atomic queries (where each atomic query might be evaluated by a different subsystem), the naive algorithm must retrieve a number of elements that is linear in the database size. In contrast, we give an algorithm, which has been implemented in Garlic, such that if the conjuncts are independent, then with arbitrarily high probability the total number of elements retrieved in evaluating the query is sublinear in the database size (in the case of two conjuncts, it is of the order of the square root of the database size). We also show that for such queries the algorithm is optimal. The matching upper and lower bounds are robust, in the sense that they hold under almost any reasonable rule (including the standard min rule of fuzzy logic) for evaluating the conjunction. Finally, we find a query that is provably hard, in the sense that the naive linear algorithm is essentially optimal.
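The abstract does not spell out the algorithm itself. As a minimal sketch, assuming it follows the two-phase scheme commonly known as Fagin's algorithm, the Python below evaluates a conjunction of atomic queries over graded sets. Everything named here is hypothetical and chosen for illustration: the function `fagin_top_k`, the in-memory dictionaries standing in for subsystems, and the sample grades. The sketch further assumes that every subsystem grades every object and that the combining rule is monotone (the min rule by default).

```python
from typing import Callable, Dict, List, Sequence, Tuple


def fagin_top_k(
    grades: Sequence[Dict[str, float]],
    k: int,
    combine: Callable[[List[float]], float] = min,
) -> Tuple[List[Tuple[str, float]], int]:
    """Return the k best objects under a conjunction of atomic queries.

    grades[i] maps each object id to its grade in [0, 1] under atomic
    query i. Assumption: all dictionaries share the same set of keys.
    Also returns the number of sorted accesses made, for comparison
    with the naive algorithm, which reads every list in full.
    """
    m = len(grades)
    # Stand-in for "sorted access": each subsystem can stream its
    # objects in descending order of grade (ties broken by id).
    streams = [sorted(g, key=lambda o, g=g: (-g[o], o)) for g in grades]
    n = len(streams[0])

    seen_count: Dict[str, int] = {}  # object -> number of streams it appeared in
    matched = 0                      # objects seen in all m streams
    depth = 0
    sorted_accesses = 0

    # Phase 1: read the streams in lockstep until at least k objects
    # have appeared in every stream.
    while matched < k and depth < n:
        for stream in streams:
            obj = stream[depth]
            sorted_accesses += 1
            seen_count[obj] = seen_count.get(obj, 0) + 1
            if seen_count[obj] == m:
                matched += 1
        depth += 1

    # Phase 2: stand-in for "random access": fetch all grades of every
    # object seen so far and combine them with the monotone rule.
    scored = [(combine([g[obj] for g in grades]), obj) for obj in seen_count]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(obj, score) for score, obj in scored[:k]], sorted_accesses


# Illustrative data (made up): grades for "is red" and "is round".
color = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.1}
shape = {"a": 0.2, "b": 0.7, "c": 0.95, "d": 0.6}

top1, cost1 = fagin_top_k([color, shape], k=1)
top2, cost2 = fagin_top_k([color, shape], k=2)
print(top1, cost1)
print(top2, cost2)
```

Under a monotone rule, any object that never appears in phase 1 is graded no higher in every list than each of the k fully matched objects, so it cannot beat them. This is why the sketch can stop early instead of scanning each list to the end.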
