SilkRoute: A framework for publishing relational data in XML

XML is the "lingua franca" for data exchange between inter-enterprise applications. In this work, we describe SilkRoute, a framework for publishing relational data in XML. In SilkRoute, relational data is published in three steps: the relational tables are presented to the database administrator in a canonical XML view; the database administrator defines, in the XQuery query language, a public, virtual XML view over the canonical XML view; and an application formulates an XQuery query over the public view. SilkRoute composes the application query with the public-view query, translates the result into SQL, executes the SQL on the relational engine, and assembles the resulting tuple streams into an XML document. This work makes two key contributions to XML query processing. First, it describes an algorithm that translates an XQuery expression into SQL. The translation depends on a query representation that separates the structure of the output XML document from the computation that produces the document's content. The second contribution addresses the optimization problem of how to decompose an XML view over a relational database into an optimal set of SQL queries. We formally define the optimization problem, describe the search space, and propose a greedy, cost-based optimization algorithm, which obtains its cost estimates from the relational engine. Experiments confirm that the algorithm produces queries that are nearly optimal.
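
The last stage of the pipeline described above, running a set of SQL queries and assembling their tuple streams into one XML document, can be illustrated with a minimal sketch. This is not SilkRoute's implementation: the `supplier`/`part` schema, the function names `build_demo_db` and `publish`, the choice of one SQL query per level of the view, and the sorted-merge assembly are all assumptions made for illustration, and SQLite stands in for the relational engine.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_demo_db():
    """Create a hypothetical two-table database (assumed schema, not from the paper)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE supplier (sid INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE part (pid INTEGER PRIMARY KEY, sid INTEGER, pname TEXT);
        INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO part VALUES (10, 1, 'bolt'), (11, 1, 'nut'), (12, 2, 'gear');
    """)
    return conn


def publish(conn):
    """Run one SQL query per level of the XML view and merge the sorted streams."""
    # Both streams are ordered by the shared key so they can be merged in one pass.
    suppliers = conn.execute("SELECT sid, name FROM supplier ORDER BY sid")
    parts = conn.execute("SELECT sid, pname FROM part ORDER BY sid, pid")

    root = ET.Element("suppliers")
    part_row = parts.fetchone()
    for sid, name in suppliers:
        supplier_el = ET.SubElement(root, "supplier")
        ET.SubElement(supplier_el, "name").text = name
        # Skip any part rows whose supplier key precedes the current supplier.
        while part_row is not None and part_row[0] < sid:
            part_row = parts.fetchone()
        # Attach every part row that belongs to the current supplier.
        while part_row is not None and part_row[0] == sid:
            ET.SubElement(supplier_el, "part").text = part_row[1]
            part_row = parts.fetchone()
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(publish(build_demo_db()))
```

The sketch fixes one particular decomposition (two queries). The same view could instead be produced by a single outer-join query or by other groupings of its nodes, and choosing among such alternatives using cost estimates from the relational engine is the optimization problem the abstract refers to.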
