View-Based Tree-Language Rewritings for XML

We study query rewriting using views (QRV) for XML. Our queries and views are regular tree languages (RTLs) represented by tree automata over marked alphabets, where the markers serve as "node selectors". We formally define query rewriting using views for RTLs and give an automata-based algorithm to compute the maximally contained rewriting. The formalism we use is equivalent in expressive power to monadic second-order (MSO) logic, and our algorithm for computing QRV is the first to target this expressive class. Furthermore, we prove a tight lower bound, showing that our algorithm is optimal. Another strength of our automata-based approach is that it casts computing QRV as a sequence of intuitive operations on automata, which makes it practical: it can be implemented easily with off-the-shelf automata toolboxes. Finally, we generalize our framework to account for more complex queries in the spirit of the FOR clause in XQuery. For this generalization as well, we give an optimal algorithm for computing the maximally contained rewriting of queries using views.
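To make the idea of markers as "node selectors" concrete, the following is a minimal sketch and not the construction from the paper. It assumes a toy deterministic bottom-up automaton over a marked alphabet: a node is selected exactly when the tree with only that node marked is accepted. The example query ("title" nodes whose parent is "book"), the state names, and the helpers `delta`, `run`, `all_paths`, and `select` are all hypothetical and chosen only for illustration.

```python
from typing import Iterator, List, Tuple

# A tree is (label, children); nodes are addressed by their path from the root.
Tree = Tuple[str, list]
Path = Tuple[int, ...]


def delta(label: str, marked: bool, child_states: List[str]) -> str:
    """Bottom-up transition of the toy automaton (hypothetical states).

    none    : no marked node in this subtree
    pending : the subtree root is the marked 'title'; its parent is unchecked
    ok      : exactly one marked node below, and it satisfied the query
    dead    : reject
    """
    if "dead" in child_states:
        return "dead"
    pending = child_states.count("pending")
    ok = child_states.count("ok")
    if marked:
        # The marked node must be the only mark and must be labeled 'title'.
        if pending + ok > 0 or label != "title":
            return "dead"
        return "pending"
    if pending + ok > 1:
        return "dead"  # more than one marked node
    if pending == 1:
        # The parent of the marked 'title' must be labeled 'book'.
        return "ok" if label == "book" else "dead"
    return "ok" if ok == 1 else "none"


def run(tree: Tree, mark: Path, path: Path = ()) -> str:
    """Evaluate the automaton on `tree` with exactly the node at `mark` marked."""
    label, children = tree
    states = [run(c, mark, path + (i,)) for i, c in enumerate(children)]
    return delta(label, path == mark, states)


def all_paths(tree: Tree, path: Path = ()) -> Iterator[Path]:
    """Enumerate the paths of all nodes in document order."""
    yield path
    for i, child in enumerate(tree[1]):
        yield from all_paths(child, path + (i,))


def select(tree: Tree) -> List[Path]:
    """Nodes whose marked version of the tree is accepted (final state 'ok')."""
    return [p for p in all_paths(tree) if run(tree, p) == "ok"]


doc: Tree = ("lib", [("book", [("title", []), ("author", [])]),
                     ("cd", [("title", [])])])
print(select(doc))
```

On this sample document only the "title" under "book" is selected; the "title" under "cd" is rejected because its parent fails the label check. Re-running the automaton once per candidate node is quadratic and is done here only for clarity.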
