Scalable distributed query and update service implementations for XML document elements

Materialization and indexing of XML views are important issues, and a number of 'XML-to-RDBMS' mappings have been proposed. To build a scalable system for XML document storage in which those views are kept up to date, our approach is to provide efficient implementations of basic services for query processing and updates. As a first contribution, we define such services and show that they support many 'XML-to-RDBMS' mappings. As a second contribution, we describe implementations of such services in a database cluster. We evaluate two implementations, one based on a two-level logging and isolation mechanism and one using Windows 2000 COM+ transactions. The main results are as follows: the first approach scales linearly for updates, and its query response times are interactive. For example, under a high workload of 100 concurrent clients and 8 cluster components, query response times average 3 seconds. The alternative approach is weaker in both query response times (10 seconds on average for the same scenario) and scalability.
