Partition-based workload scheduling in living data warehouse environments

The demand for so-called living or real-time data warehouses is increasing in many application areas, such as manufacturing, event monitoring, and telecommunications. In these fields, users usually expect short response times for their queries and high freshness of the requested data. However, meeting these fundamental requirements is challenging because of the high load and the continuous flow of write-only updates and read-only queries, which may conflict with each other. We therefore present the concept of Workload Balancing by Election (WINE), which allows users to express their individual demands on Quality of Service and Quality of Data. WINE uses this information to balance and prioritize both types of transactions, queries and updates, according to the varying user needs. A simulation study shows that the proposed algorithm outperforms the baseline algorithms over the entire spectrum of workloads and user requirements.
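To make the idea of balancing by election more concrete, the following is a minimal Python sketch. It is not the published WINE algorithm, only a toy illustration under stated assumptions. The names `Query`, `qos_weight`, and `schedule` are hypothetical, as is the rule that each pending query casts a weighted vote for either fast response (Quality of Service) or fresh data (Quality of Data), with the scheduler running an update whenever the freshness votes are in the majority.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Query:
    name: str
    # Hypothetical preference in [0, 1]: 1.0 means "I only care about a
    # fast response", 0.0 means "I only care about fresh data".
    qos_weight: float


def schedule(queries, updates):
    """Return a toy execution order (list of names) for queries and updates.

    Assumption: before every scheduling step, the pending queries "vote".
    If the summed Quality-of-Data preference exceeds the summed
    Quality-of-Service preference, the next update runs; otherwise the
    next query runs.
    """
    pending_queries = deque(queries)
    pending_updates = deque(updates)
    order = []
    while pending_queries or pending_updates:
        qos_votes = sum(q.qos_weight for q in pending_queries)
        qod_votes = sum(1.0 - q.qos_weight for q in pending_queries)
        run_update = pending_updates and (
            not pending_queries or qod_votes > qos_votes
        )
        if run_update:
            order.append(pending_updates.popleft())
        else:
            order.append(pending_queries.popleft().name)
    return order


example_order = schedule(
    [Query("q1", 0.9), Query("q2", 0.8), Query("q3", 0.1)],
    ["u1", "u2"],
)
print(example_order)
```

In this example the first vote favors response time, so `q1` runs first. Once `q1` has left the queue, the remaining queries favor freshness, so both updates are applied before `q2` and `q3` execute. A real scheduler would additionally have to consider which partitions each query and update touches, which this sketch leaves out.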
