Describing and Utilizing Constraints to Answer Queries in Data-Integration Systems

In data-integration systems, information sources often have constraints such as “all houses stored at a source have a unique address.” These constraints are useful for computing answers to queries. In this paper we study how to describe these constraints so that they can be utilized in query processing and optimization. We consider the local-as-view approach to data integration (under the open-world assumption), in which source contents and user queries are formulated on predefined global predicates. In this approach, source contents and constraints often exist before the global predicates are designed. We discuss two levels of describing constraints: local constraints are defined on sources, while global constraints are defined on global predicates. We formally define two types of global constraints, namely general global constraints and source-derived global constraints. We present the advantages of having these constraints and discuss open problems that need further research.
