Integrity constraints for XML

Integrity constraints are useful for semantic specification, query optimization and data integration. The ID/IDREF mechanism provided by XML DTDs relies on a simple form of constraint to describe references. Yet this mechanism is not sufficient to express semantic constraints, such as keys or inverse relationships, or stronger, object-style references. In this paper, we investigate integrity constraints for XML, both for semantic purposes and to improve its current reference mechanism. We extend DTDs with several families of constraints, including keys, foreign keys, inverse constraints and constraints specifying the semantics of object identities. These constraints are useful both for native XML documents and for preserving the semantics of data originating in relational or object databases. Complexity and axiomatization results are established for the (finite) implication problems associated with these constraints. These results also extend relational dependency theory on the interaction between (primary) keys and foreign keys. In addition, we investigate implication of more general constraints, such as functional, inclusion and inverse constraints defined in terms of navigation paths.
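
To make the distinction between ID/IDREF references and key or foreign key constraints concrete, the following is a minimal sketch of how attribute-based constraints of this kind could be checked on a single document. It is an illustration only, not the formalism or any algorithm from the paper: the sample document, the element and attribute names (`course`, `enroll`, `cno`, `sid`) and the helper functions are all hypothetical, and the sketch assumes that a key means "no two elements of a given type share a value for the attribute" and that a foreign key means "every referencing value occurs among the values of a key attribute".

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative only.
DOC = """<school>
  <course cno="c1"/>
  <course cno="c2"/>
  <enroll sid="s1" cno="c1"/>
  <enroll sid="s2" cno="c3"/>
</school>"""

def attr_values(root, tag, attr):
    """Collect the values of `attr` over all `tag` elements that carry it."""
    return [e.get(attr) for e in root.iter(tag) if e.get(attr) is not None]

def satisfies_key(root, tag, attr):
    """Assumed key semantics: no two `tag` elements share an `attr` value."""
    values = attr_values(root, tag, attr)
    return len(values) == len(set(values))

def satisfies_foreign_key(root, src_tag, src_attr, dst_tag, dst_attr):
    """Assumed foreign key semantics: the target attribute is a key and
    every source value occurs among the target values."""
    if not satisfies_key(root, dst_tag, dst_attr):
        return False
    src = set(attr_values(root, src_tag, src_attr))
    dst = set(attr_values(root, dst_tag, dst_attr))
    return src <= dst

root = ET.fromstring(DOC)
key_ok = satisfies_key(root, "course", "cno")
fk_ok = satisfies_foreign_key(root, "enroll", "cno", "course", "cno")
print(key_ok, fk_ok)
```

In this sample, `cno` is unique among `course` elements, but the second `enroll` element refers to a course `c3` that does not exist, so the foreign key check fails. A plain IDREF attribute could not state that the reference must point at a `course` element specifically, since ID values are scoped to the whole document rather than to an element type.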
