Paper information - High-level parallelisation in a database cluster: a feasibility study using document services

High-level parallelisation in a database cluster: a feasibility study using document services

Our concern is the design of a scalable infrastructure for complex application services. We want to find out if a cluster of commodity database systems is well-suited as such an infrastructure. To this end, we have carried out a feasibility study based on document services, e.g. document insertion and retrieval. We decompose a service request into short parallel database transactions. Our system, implemented as an extension of a transaction processing monitor, routes the short transactions to the appropriate database systems in the cluster. Routing depends on the data distribution that we have chosen. To avoid bottlenecks, we distribute document functionality, such as term extraction, over the cluster. Extensive experiments show the following. (1) A relatively small number of components - for example eight components - already suffices to cope with high workloads of more than 100 concurrently active clients. (2) Speedup and throughput increase linearly for insertion operations when increasing the cluster size. These observations also hold when bundling service invocations into transactions at the semantic layer. A specialized coordinator component then implements semantic serializability and atomicity. Our experiments show that such a coordinator has minimal impact on CPU resource consumption and on response times.

Hans-Jörg Schek | Klemens Böhm | Torsten Grabs

[1] Torsten Grabs, et al. A Parallel Document Engine Built on Top of a Cluster of Databases - Design, Implementation, and Experiences, 2000, ICDE 2000.

[2] Krithi Ramamritham, et al. Efficient transaction support for dynamic information retrieval systems, 1996, SIGIR '96.

[3] Torsten Grabs, et al. A document engine on a db cluster, 1999.

[4] Ophir Frieder, et al. Integrating structured data and text: a relational approach, 1997.

[5] Sharad Mehrotra, et al. The Gold Text Indexing Engine, 1996, Proceedings of the Twelfth International Conference on Data Engineering.

[6] Patrick Valduriez, et al. Transaction chopping: algorithms and performance studies, 1995, TODS.

[7] Samuel DeFazio. Overview of the Full-Text Document Retrieval Benchmark, 1993, The Benchmark Handbook.

[8] Tom W. Keller, et al. Data placement in Bubba, 1988, SIGMOD '88.

[9] Chaitanya K. Baru, et al. DB2 Parallel Edition, 1995, IBM Syst. J.

[10] Hans-Jörg Schek, et al. A Predicate Oriented Locking Approach for Integrated Information Systems, 1983, IFIP Congress.

[11] Yuri Breitbart, et al. Unifying Concurrency Control and Recovery of Transactions with Semantically Rich Operations, 1998, Theor. Comput. Sci.