Improving query performance on XML documents: a workload-driven design approach

As XML has emerged as a data representation format and large quantities of data have been stored in XML, XML document design has become an important issue in several application contexts. Methodologies based on conceptual modeling are being applied to design XML documents. However, converting a conceptual schema to an XML schema is a complex process. In many cases, conceptual relationships cannot be represented hierarchically and must instead be represented by reference relationships in the XML schema. The problem is that reference relationships generate a disconnected XML structure and, consequently, add overhead to query processing on XML documents. This paper presents a design approach for generating XML schemas from conceptual schemas that takes into account the expected workload of the XML applications. The query workload is used to produce XML schemas that minimize the impact of reference relationships on query performance. We evaluate our approach through a case study in which a set of XML documents is redesigned using our methodology. The results demonstrate that query performance improves, measured by the number of accesses that queries generate on the XML documents designed by our approach.
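To illustrate why a reference relationship can cost more to query than a hierarchical one, the following minimal sketch compares the two layouts on a toy document. Everything in it is an assumption made for illustration: the department/employee data, the element and attribute names, the function names, and the "visited elements" counter, which is only a rough stand-in for the access counts mentioned above and is not the cost model used in the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy data: the same facts stored in two layouts.
# Layout 1: employees point to their department through a reference attribute.
REFERENCE_DOC = """
<db>
  <department id="d1" name="Sales"/>
  <department id="d2" name="Research"/>
  <employee name="Ana" dept="d1"/>
  <employee name="Bo" dept="d2"/>
  <employee name="Cy" dept="d1"/>
</db>
""".strip()

# Layout 2: employees are nested inside their department.
NESTED_DOC = """
<db>
  <department id="d1" name="Sales">
    <employee name="Ana"/>
    <employee name="Cy"/>
  </department>
  <department id="d2" name="Research">
    <employee name="Bo"/>
  </department>
</db>
""".strip()


def employees_by_reference(root, dept_name):
    """Find the department, then scan every employee for a matching reference."""
    visited = 0
    dept_id = None
    for dept in root.findall("department"):
        visited += 1
        if dept.get("name") == dept_name:
            dept_id = dept.get("id")
            break
    names = []
    for emp in root.findall("employee"):
        visited += 1  # every employee is inspected, matching or not
        if emp.get("dept") == dept_id:
            names.append(emp.get("name"))
    return names, visited


def employees_by_nesting(root, dept_name):
    """Find the department, then read only its child employees."""
    visited = 0
    names = []
    for dept in root.findall("department"):
        visited += 1
        if dept.get("name") == dept_name:
            for emp in dept.findall("employee"):
                visited += 1
                names.append(emp.get("name"))
            break
    return names, visited


ref_names, ref_visited = employees_by_reference(ET.fromstring(REFERENCE_DOC), "Sales")
nest_names, nest_visited = employees_by_nesting(ET.fromstring(NESTED_DOC), "Sales")
print(ref_names, ref_visited)
print(nest_names, nest_visited)
```

Both functions return the same employees, but the reference layout inspects every employee element to resolve the join, while the nested layout touches only the children of the matching department. If the workload is dominated by the opposite query (for example, looking up an employee by name without regard to department), nesting may not be the better choice, which is the kind of trade-off a workload-driven design has to weigh.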
