biXid: a bidirectional transformation language for XML

Often, independent organizations define and advocate different XML formats for a similar purpose and, as a result, application programs need to mutually convert between such formats. Existing XML transformation languages, such as XSLT and XDuce, are unsatisfactory for this purpose since we would have to write, e.g., two programs for the forward and the backward transformations in the case of two formats, incurring high development and maintenance costs.

This paper proposes the bidirectional XML transformation language biXid, allowing us to write only one program for both directions of conversion. Our language adopts the common programming-by-relation paradigm, where a program defines a relation over documents and transforms a document to another in a way that satisfies this relation. Our contributions here are specific language features for facilitating realistic conversions whose target formats are loosely in parallel but have many discrepancies in details. Concretely, we (1) adopt XDuce-style regular expression patterns for describing and analyzing XML structures, (2) fully permit ambiguity for treating formats that do not have equivalent expressiveness, and (3) allow non-linear pattern variables for expressing non-trivial transformations that cannot be written only with linear patterns, such as conversion between unordered and ordered data.

We further develop an efficient evaluation algorithm for biXid, consisting of the "parsing" phase that transforms the input document to an intermediate "parse tree" structure and the "unparsing" phase that transforms it to an output document. Both phases use a variant of finite tree automata for performing a one-pass scan on the input or the parse tree by using a standard technique that "maintains the set of all transitable states." However, the construction of the "unparsing" phase is challenging since ambiguity causes different ways of consuming the parse tree and thus results in multiple possible outputs that may have different structures.

We have implemented a prototype system of biXid and confirmed, through experiments with several realistic bidirectional transformations including one between vCard-XML and ContactXML, that it has sufficient expressiveness and linear-time performance.
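
The standard technique of maintaining the set of all transitable states can be illustrated with a minimal sketch. This is not biXid's evaluation algorithm: it assumes a plain nondeterministic bottom-up automaton over binary trees, and the tree encoding, the rule tables, and the names `reachable_states` and `accepts` are hypothetical choices made for this example. Each node is visited once, and ambiguity is handled by carrying a set of states upward instead of backtracking.

```python
from itertools import product

# Assumed encoding: a leaf is (label,), an inner node is (label, left, right).

def reachable_states(tree, leaf_rules, node_rules):
    """Return every state the automaton can assign to the root of `tree`."""
    if len(tree) == 1:
        # Leaf: look up all states allowed for this label.
        return set(leaf_rules.get(tree[0], ()))
    label, left, right = tree
    left_states = reachable_states(left, leaf_rules, node_rules)
    right_states = reachable_states(right, leaf_rules, node_rules)
    result = set()
    # Try every combination of child states; keep all resulting states.
    for q_left, q_right in product(left_states, right_states):
        result |= node_rules.get((label, q_left, q_right), set())
    return result

def accepts(tree, leaf_rules, node_rules, final_states):
    """A tree is accepted if some reachable root state is final."""
    return bool(reachable_states(tree, leaf_rules, node_rules) & final_states)

# Hypothetical ambiguous automaton: the leaf "a" may be read as "x" or "y".
leaf_rules = {"a": {"x", "y"}, "b": {"y"}}
node_rules = {("f", "x", "y"): {"ok"}, ("f", "y", "y"): {"alt"}}

tree_ab = ("f", ("a",), ("b",))
tree_ba = ("f", ("b",), ("a",))

states_ab = reachable_states(tree_ab, leaf_rules, node_rules)
states_ba = reachable_states(tree_ba, leaf_rules, node_rules)
```

In this sketch `tree_ab` reaches both `"ok"` and `"alt"` because the leaf `"a"` is ambiguous, whereas `tree_ba` reaches only `"alt"`. The work per node is bounded by the number of state pairs, which is a constant for a fixed automaton, so the scan is linear in the size of the tree.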
