Towards Cost-Effective Storage Provisioning for DBMSs

Data center operators face a bewildering set of choices when considering how to provision resources on machines with complex I/O subsystems. Modern I/O subsystems often have a rich mix of fast, high-performing, but expensive SSDs sitting alongside cheaper traditional hard disk drives that are relatively slower for random accesses. Data center operators need to determine how to provision the I/O resources for specific workloads so as to abide by existing Service Level Agreements (SLAs) while minimizing the total operating cost (TOC) of running the workload, where the TOC includes the amortized hardware costs and the run-time energy costs. The focus of this paper is on introducing this new problem of TOC-based storage allocation, cast in a framework that is compatible with traditional DBMS query optimization and query processing architectures. We also present a heuristic-based solution to this problem, called DOT. We have implemented DOT in PostgreSQL, and experiments using TPC-H and TPC-C demonstrate significant TOC reduction by DOT in various settings.
