Impact of XML Schema Evolution

We consider the problem of XML Schema evolution. In the ever-changing context of the web, XML schemas continuously change to keep up with the natural evolution of the entities they describe. Schema changes have two important consequences. First, existing documents that are valid with respect to the original schema are no longer guaranteed to satisfy the constraints of the evolved schema. Second, the evolution also affects programs that manipulate documents whose structure is described by the original schema. We propose a unifying framework for determining the effects of XML Schema evolution on both the validity of documents and the results of queries. The system analyzes scenarios in which forward or backward compatibility of schemas is broken, and scenarios in which the result of a query may no longer be what was expected. Specifically, the system offers a predicate language in which properties related to schema evolution can be formulated. It then relies on exact reasoning techniques to perform a fine-grained analysis, which yields either a formal proof of the property or a counter-example that can be used for debugging. The system has been fully implemented and tested on real-world use cases, in particular the main standard document formats used on the web, as defined by the W3C. The system precisely identifies compatibility relations between document formats. When these relations do not hold, the system can identify the queries that must be reformulated in order to produce the expected results across successive schema versions.
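As a rough illustration of the kind of question such a property asks, the sketch below searches for a counter-example to compatibility between two versions of a single content model. It is only a toy under stated assumptions: the content models, the one-letter element names, and the function `find_counterexample` are hypothetical, and the search is a bounded brute-force enumeration rather than the exact reasoning technique the system uses.

```python
import re
from itertools import product


def find_counterexample(source_model, target_model, alphabet, max_len=4):
    """Return a child sequence accepted by source_model but rejected by
    target_model, or None if none exists up to max_len children.

    Content models are plain regular expressions over one-letter element
    names (an assumption made for this sketch only).
    """
    source_re = re.compile(source_model)
    target_re = re.compile(target_model)
    for length in range(max_len + 1):
        for seq in product(alphabet, repeat=length):
            word = "".join(seq)
            if source_re.fullmatch(word) and not target_re.fullmatch(word):
                return word
    return None


# Hypothetical content models: t = title, a = author, p = paragraph.
OLD = "ta?p*"  # title, optional author, any number of paragraphs
NEW = "tap*"   # evolved version: author becomes mandatory

# Is some document valid for the old schema rejected by the new one?
old_doc_broken = find_counterexample(OLD, NEW, "tap")
# Is some document valid for the new schema rejected by the old one?
new_doc_broken = find_counterexample(NEW, OLD, "tap")

print(old_doc_broken)  # a witness such as a title with no author
print(new_doc_broken)  # None: no witness found within the bound
```

A returned witness plays the role of the counter-example mentioned above, while `None` only means that no violation was found within the bound, which is weaker than a formal proof.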
