Query planning with limited source capabilities

In information-integration systems, sources may have diverse and limited query capabilities. We show that, because sources restrict how their information can be retrieved, sources not mentioned in a query can still contribute to the query result by providing useful bindings. In some cases, sources must be accessed repeatedly to retrieve the bindings needed to answer a query, which makes query planning considerably more challenging. We find all the obtainable answers to a query by translating the query and the source descriptions into a simple recursive Datalog program and evaluating the program on the source relations. This program often accesses sources that are not in the query. Some of these accesses are essential, because they provide bindings without which we could not query other sources. Others, however, can be proven to add nothing to the query's answer. We characterize the cases in which these off-query accesses are useless and prove that, in those cases, the complete answer to the query can be computed using only the sources in the query. For the cases where off-query accesses are necessary, we propose an algorithm for finding all the useful sources for a query. We thus solve the optimization problem of eliminating unnecessary source accesses and optimize the program that answers the query.
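The role of off-query sources can be illustrated with a small sketch. It is not the paper's Datalog translation. It is a naive fixpoint loop in Python that mimics what evaluating such a recursive program would do, under stated assumptions. The source names (`v1` to `v4`), their attributes, binding patterns, and data are all hypothetical. The assumed query asks for the prices of CDs containing a given song and mentions only `v1` and `v2`, yet both require a bound `cd`, so the needed bindings can only come from `v3` and `v4`.

```python
from itertools import product

# Hypothetical sources: name -> (attribute names, binding pattern, hidden tuples).
# 'b' means the attribute must be bound to query the source; 'f' means free.
SOURCES = {
    "v1": (("song", "cd"), "fb", {("s1", "c1"), ("s2", "c1"), ("s1", "c3")}),
    "v2": (("cd", "price"), "bf", {("c1", 10), ("c2", 15), ("c3", 20)}),
    "v3": (("song", "artist"), "bf", {("s1", "a1")}),
    "v4": (("artist", "cd"), "bf", {("a1", "c1"), ("a1", "c2")}),
}


def fetch(name, bound):
    """Simulate one source access: return the tuples matching the bound values."""
    attrs, pattern, tuples = SOURCES[name]
    return {t for t in tuples
            if all(t[i] == bound[a]
                   for i, a in enumerate(attrs) if pattern[i] == "b")}


def obtainable(initial_bindings):
    """Naive fixpoint: access every source with every known binding until
    no new tuple appears. Returns the tuples obtained from each source."""
    domain = {a: set(vs) for a, vs in initial_bindings.items()}
    obtained = {name: set() for name in SOURCES}
    changed = True
    while changed:
        changed = False
        for name, (attrs, pattern, _) in SOURCES.items():
            bound_attrs = [a for a, p in zip(attrs, pattern) if p == "b"]
            pools = [sorted(domain.get(a, ())) for a in bound_attrs]
            for combo in product(*pools):
                for t in fetch(name, dict(zip(bound_attrs, combo))):
                    if t not in obtained[name]:
                        obtained[name].add(t)
                        changed = True
                    # Every returned value becomes a binding for later accesses.
                    for a, v in zip(attrs, t):
                        domain.setdefault(a, set()).add(v)
    return obtained


def answer(song):
    """Assumed query: ans(price) :- v1(song, cd), v2(cd, price)."""
    got = obtainable({"song": {song}})
    return {price
            for (s, cd) in got["v1"] if s == song
            for (cd2, price) in got["v2"] if cd2 == cd}


print(answer("s1"))  # {10}
```

In this toy data, the tuple `("s1", "c3")` in `v1` is never retrieved because no source ever yields the binding `c3`, so the price 20 is not part of the obtainable answer. The loop also accesses every source with every known binding, which is the kind of unnecessary work the optimization described above aims to eliminate.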
