A Self-tuning Framework for Cloud Storage Clusters

Tuning and self-tuning of data management systems are well-known problems, and they are amplified in Cloud environments, which promise self-management along with properties such as elasticity and scalability. The intricate characteristics of Cloud storage systems, such as their modular, distributed, and multi-layered architecture, add to the complexity of the tuning and self-tuning process. In this paper, we present an architecture for a self-tuning framework for Cloud data storage clusters. The framework consists of components that observe and model selected performance criteria and a decision model that adjusts tuning parameters according to specified requirements. As part of its implementation, we give an overview of the benchmarking and performance modeling components along with experimental results.
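
To make the observe, model, and decide structure concrete, the following is a minimal sketch and not the framework's actual implementation. It assumes a single hypothetical integer tuning parameter, a latency requirement, a simple least-squares line as the performance model, and a preference for the largest feasible parameter value. All names (`Observation`, `fit_line`, `choose_setting`) are illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Observation:
    """One benchmark measurement (hypothetical structure)."""
    setting: int       # value of an assumed tuning parameter
    latency_ms: float  # latency observed under that setting


def fit_line(observations: List[Observation]) -> Tuple[float, float]:
    """Model step: least-squares fit of latency = slope * setting + intercept."""
    n = len(observations)
    mean_x = sum(o.setting for o in observations) / n
    mean_y = sum(o.latency_ms for o in observations) / n
    sxx = sum((o.setting - mean_x) ** 2 for o in observations)
    sxy = sum((o.setting - mean_x) * (o.latency_ms - mean_y) for o in observations)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def choose_setting(
    candidates: List[int],
    model: Tuple[float, float],
    max_latency_ms: float,
) -> Optional[int]:
    """Decision step: pick the largest candidate whose predicted latency
    meets the requirement (assumes larger values are preferable)."""
    slope, intercept = model
    feasible = [c for c in candidates if slope * c + intercept <= max_latency_ms]
    return max(feasible) if feasible else None


# Observe step: in a real deployment these values would come from a benchmark run.
observations = [Observation(1, 2.0), Observation(2, 4.0), Observation(3, 6.0)]
model = fit_line(observations)
best = choose_setting([1, 2, 3, 4, 5], model, max_latency_ms=8.0)
print(model, best)
```

A real decision model would cover several parameters and performance criteria at once. The linear fit here only stands in for whatever performance model the framework uses.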
