Semi-structured data management in the enterprise: a nimble, high-throughput, and scalable approach

In this paper we describe an approach and a system for managing enterprise semi-structured data that is high-throughput, nimble, and scalable. We present the NETMARK system, which provides a "schemaless" way of managing semi-structured documents. We describe in detail its unique underlying data storage approach and the efficient query processing mechanisms built on that storage system. We present an extensive benchmark evaluation of NETMARK and compare it with related XML management systems. At the heart of the approach is a philosophy of focusing on the most common data management requirements in the enterprise rather than burdening users and application developers with unnecessary complexity and formal schemas.
