Multi-objective scheduling for real-time data warehouses

Abstract

Write-read contention is one of the most prevalent problems when deploying real-time data warehouses. As load increases, updates are increasingly delayed and previously fast queries slow down considerably. However, depending on the user requirements, we can improve either the response time or the data quality by scheduling the queries and updates appropriately. If both criteria are to be considered simultaneously, we face a multi-objective optimization problem. We transformed this problem into a knapsack problem with additional inequalities and solved it efficiently. Based on our solution, we developed a scheduling approach that provides the optimal schedule with regard to the user requirements at any given point in time. We evaluated our scheduling approach in an extensive experimental study, in which we compared it with the optimal scheduling policy for each single optimization objective.
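
To make the knapsack view concrete, the following is a minimal sketch of a textbook 0-1 knapsack solved by dynamic programming. It is not the formulation used in the paper: the additional inequalities mentioned above are omitted, and the function name `knapsack_select`, the integer `costs` (for example, estimated execution times of pending updates), the `profits` (for example, a user-weighted gain in data quality), and the `capacity` (a time budget) are all assumptions introduced for illustration only.

```python
from typing import List, Tuple


def knapsack_select(costs: List[int], profits: List[float],
                    capacity: int) -> Tuple[float, List[int]]:
    """Textbook 0-1 knapsack via dynamic programming (illustrative only).

    costs[i]   -- hypothetical integer cost of item i (e.g. execution time)
    profits[i] -- hypothetical benefit of running item i
    capacity   -- hypothetical budget that the selected costs must not exceed
    Returns the best total profit and the indices of the selected items.
    """
    n = len(costs)
    # best[i][c] = maximum profit using only the first i items with budget c
    best = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost, profit = costs[i - 1], profits[i - 1]
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]          # skip item i-1
            if cost <= c:
                candidate = best[i - 1][c - cost] + profit  # take item i-1
                if candidate > best[i][c]:
                    best[i][c] = candidate

    # Walk the table backwards to recover which items were taken.
    chosen: List[int] = []
    c = capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(i - 1)
            c -= costs[i - 1]
    chosen.reverse()
    return best[n][capacity], chosen


if __name__ == "__main__":
    # Three hypothetical updates with made-up costs and profits.
    total, picked = knapsack_select([3, 4, 2], [4.0, 5.0, 3.0], capacity=6)
    print(total, picked)
```

In this toy instance the budget of 6 admits the second and third items (total cost 6), which together yield the highest profit. The table has `(n + 1) * (capacity + 1)` entries, so the running time is pseudo-polynomial in the budget.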
