Rewriting of visibly pushdown languages for XML data integration

Abstract. In this work, we focus on XML data integration by studying rewritings of XML target schemas in terms of source schemas. Rewriting is central to data integration systems, in which the system must find and assemble XML documents from the data sources and produce documents that satisfy a target schema. As the schema representation, we consider Visibly Pushdown Automata (VPAs), which accept Visibly Pushdown Languages (VPLs). The latter have been shown to coincide with the family of (word-encoded) regular tree languages, which are the basis of formalisms for specifying XML schemas. Furthermore, practical semi-formal XML schema specifications (defined by simple pattern conditions on XML) compile into VPAs that are exponentially more concise than other representations based on tree automata. Notably, VPLs are “well-behaved” in ways that facilitate addressing rewriting problems for XML data integration. Based on VPAs, we give positive solutions to these problems and present detailed complexity analyses.
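To make the role of VPAs as schema representations concrete, the following is a minimal illustrative sketch, not the construction used in this work. It assumes a deterministic VPA over a word encoding in which opening tags are call symbols, closing tags are return symbols, and text is an internal symbol. The class `VPA`, the toy schema `schema` (an `a` element containing zero or more `b` elements, each holding text), and all state and stack-symbol names are hypothetical. As a simplification, the sketch accepts only well-matched words, so it rejects a return on an empty stack and requires an empty stack at the end.

```python
from dataclasses import dataclass


@dataclass
class VPA:
    """Deterministic visibly pushdown automaton (simplified sketch)."""
    calls: set            # symbols that push (opening tags)
    returns: set          # symbols that pop (closing tags)
    internals: set        # symbols that leave the stack untouched (text)
    start: str
    accepting: set
    call_delta: dict      # (state, symbol) -> (next_state, pushed_symbol)
    return_delta: dict    # (state, symbol, popped_symbol) -> next_state
    internal_delta: dict  # (state, symbol) -> next_state

    def accepts(self, word):
        state, stack = self.start, []
        for sym in word:
            if sym in self.calls:
                step = self.call_delta.get((state, sym))
                if step is None:
                    return False
                state, pushed = step
                stack.append(pushed)
            elif sym in self.returns:
                if not stack:  # unmatched closing tag
                    return False
                popped = stack.pop()
                state = self.return_delta.get((state, sym, popped))
                if state is None:
                    return False
            elif sym in self.internals:
                state = self.internal_delta.get((state, sym))
                if state is None:
                    return False
            else:  # symbol outside the alphabet
                return False
        # Accept only well-matched words that end in an accepting state.
        return state in self.accepting and not stack


# Hypothetical toy schema: <a> ( <b> text </b> )* </a>
schema = VPA(
    calls={"<a>", "<b>"},
    returns={"</a>", "</b>"},
    internals={"text"},
    start="q0",
    accepting={"done"},
    call_delta={("q0", "<a>"): ("inA", "A"), ("inA", "<b>"): ("inB", "B")},
    return_delta={("bText", "</b>", "B"): "inA", ("inA", "</a>", "A"): "done"},
    internal_delta={("inB", "text"): "bText"},
)

valid = ["<a>", "<b>", "text", "</b>", "<b>", "text", "</b>", "</a>"]
invalid = ["<a>", "<b>", "text", "</a>"]  # closing tag does not match <b>
print(schema.accepts(valid), schema.accepts(invalid))
```

Because the input symbol alone determines whether the stack is pushed, popped, or left unchanged, the stack height at every position depends only on the word and not on the automaton. This is the "visibility" that distinguishes VPAs from general pushdown automata.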