Machine learning for online query relaxation

In this paper we provide a fast, data-driven solution to the failing query problem: given a query that returns an empty answer, how can one relax the query's constraints so that it returns a non-empty set of tuples? We introduce a novel algorithm, loqr, which is designed to relax queries that are in disjunctive normal form and contain a mixture of discrete and continuous attributes. loqr discovers the implicit relationships that exist among the various domain attributes and then uses this knowledge to relax the constraints of the failing query. In the first step, loqr uses a small, randomly chosen subset of the target database to learn a set of decision rules that predict whether an attribute's value satisfies the constraints in the failing query; this query-driven operation is performed online for each failing query. In the second step, loqr uses nearest-neighbor techniques to find the learned rule that is most similar to the failing query; it then uses the attribute values from this rule to relax the failing query's constraints. Our experiments on six application domains show that loqr is both robust and fast: it successfully relaxes more than 95% of the failing queries, and it processes queries of up to 20 attributes in under a second (larger queries of up to 93 attributes are processed in several seconds).
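
The second step can be illustrated with a minimal sketch. This is not loqr itself: the interval representation of constraints, the unnormalized distance function, the hand-written rules (standing in for the rules learned in the first step), and all names below are assumptions made for illustration only, and the sketch handles just a single conjunction of continuous attributes.

```python
# Hypothetical representation: a conjunction of constraints is a dict
# mapping attribute name -> (low, high) closed interval.

def run(query, rows):
    """Return the rows that satisfy every interval constraint in the query."""
    return [row for row in rows
            if all(lo <= row[attr] <= hi for attr, (lo, hi) in query.items())]

def distance(query, rule):
    """Assumed similarity measure: summed gap between matching interval bounds.
    It is unnormalized, so attributes with large ranges dominate."""
    total = 0.0
    for attr, (q_lo, q_hi) in query.items():
        if attr in rule:
            r_lo, r_hi = rule[attr]
            total += abs(q_lo - r_lo) + abs(q_hi - r_hi)
    return total

def relax(query, rules):
    """Pick the rule nearest to the query and widen each query interval
    just enough to cover that rule's interval for the same attribute."""
    nearest = min(rules, key=lambda rule: distance(query, rule))
    relaxed = {}
    for attr, (q_lo, q_hi) in query.items():
        if attr in nearest:
            r_lo, r_hi = nearest[attr]
            relaxed[attr] = (min(q_lo, r_lo), max(q_hi, r_hi))
        else:
            relaxed[attr] = (q_lo, q_hi)
    return relaxed

# Toy data, invented for this example.
rows = [
    {"price": 1200, "weight": 2.5},
    {"price": 2500, "weight": 1.2},
    {"price": 900, "weight": 3.1},
]

# Failing query: no row is both cheap and light.
query = {"price": (0, 1000), "weight": (0, 2.0)}

# Hand-written stand-ins for the decision rules learned in the first step.
rules = [
    {"price": (0, 1300), "weight": (0, 2.6)},
    {"price": (2000, 3000), "weight": (0, 1.5)},
]

relaxed = relax(query, rules)
print(run(query, rows))    # the original query returns no rows
print(relaxed)             # constraints widened toward the nearest rule
print(run(relaxed, rows))  # the relaxed query returns a non-empty answer
```

In this toy setting the first rule is the nearest one, so both bounds are loosened only slightly and the relaxed query stays close to the user's original intent. A fuller treatment would also need discrete attributes, disjunctions, and a normalized distance.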
