Reformulation of XML Queries and Constraints

We state and solve the query reformulation problem for XML publishing in a general setting that allows mixed (XML and relational) storage for the proprietary data and exploits redundancies (materialized views, indexes, and caches) to enhance performance. The correspondence between the published and proprietary schemas is specified by views in both directions, and the same algorithm performs rewriting-with-views, composition-with-views, or the combined effect of both, thereby unifying the Global-As-View and Local-As-View approaches to data integration. We prove a completeness theorem which guarantees that, under certain conditions, our algorithm finds a minimal reformulation if one exists. Moreover, we identify conditions under which the algorithm achieves optimal complexity bounds. We solve the reformulation problem for constraints by reducing it to the problem of query reformulation.
