A generic auto-provisioning framework for cloud databases

We discuss the problem of resource provisioning for database management systems operating on top of an Infrastructure-as-a-Service (IaaS) cloud. To solve this problem, we describe an extensible framework that, given a target query workload, continually optimizes the system's operational cost, estimated from the IaaS provider's pricing model, while satisfying QoS expectations. Specifically, we describe two approaches: a “white-box” approach that uses a fine-grained estimate of the workload's expected resource consumption, and a “black-box” approach that relies on coarse-grained profiling to characterize the workload's end-to-end performance across various cloud resources. We formalize both approaches as constraint programming problems and use a generic constraint solver to tackle them efficiently. We present preliminary experimental numbers, obtained by running TPC-H queries with PostgreSQL on Amazon's EC2, that provide evidence of the feasibility and utility of our approaches. We also briefly discuss the pertinent challenges and directions of ongoing research.
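
To make the black-box formulation concrete, the following is a minimal sketch, not the framework described above. It assumes each instance type has been profiled once for the whole workload, that the workload splits evenly across identical instances, and that the provider bills per started instance-hour. It replaces the generic constraint solver with plain exhaustive enumeration, which is only practical for small search spaces. The instance names, prices, and profiled times are hypothetical and chosen purely for illustration.

```python
import math
from dataclasses import dataclass
from itertools import product
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class InstanceType:
    name: str
    hourly_price_cents: int  # hypothetical price per started instance-hour
    profiled_hours: float    # hypothetical profiled workload time on one instance


def cheapest_plan(
    types: Sequence[InstanceType],
    deadline_hours: float,
    max_instances: int = 8,
) -> Optional[Tuple[int, str, int]]:
    """Return (cost_cents, type_name, count) of the cheapest plan that meets
    the deadline, or None if no plan within max_instances is feasible."""
    best: Optional[Tuple[int, str, int]] = None
    for t, n in product(types, range(1, max_instances + 1)):
        # Assumption: perfect linear speed-up across n identical instances.
        runtime = t.profiled_hours / n
        if runtime > deadline_hours:
            continue  # QoS constraint violated
        # Assumption: billing rounds up to whole hours per instance.
        cost = t.hourly_price_cents * n * math.ceil(runtime)
        if best is None or cost < best[0]:
            best = (cost, t.name, n)
    return best


# Hypothetical catalogue used only for illustration.
CATALOGUE = [
    InstanceType("small", hourly_price_cents=10, profiled_hours=10.0),
    InstanceType("large", hourly_price_cents=40, profiled_hours=3.0),
]
```

With this catalogue and a two-hour deadline, `cheapest_plan(CATALOGUE, 2.0)` selects five `small` instances at a total of 100 cents. A constraint solver would express the same objective and deadline constraint declaratively and scale to larger configuration spaces.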
