PerfEnforce: A Dynamic Scaling Engine for Analytics with Performance Guarantees

In this paper, we present PerfEnforce, a scaling engine designed to enable cloud providers to sell performance levels for data analytics cloud services. PerfEnforce scales a cluster of virtual machines allocated to a user in a way that minimizes cost while probabilistically meeting the query runtime guarantees offered by a service level agreement. With PerfEnforce, we show how to scale a cluster in a way that minimally disrupts a user's query session. We further show when to scale the cluster using one of three methods: feedback control, reinforcement learning, or perceptron learning. We find that perceptron learning outperforms the other two methods when making cluster scaling decisions.
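To make the feedback-control option more concrete, the following is a minimal sketch of an integral controller that picks a cluster size from the gap between observed and promised query runtimes. It is an illustration under stated assumptions, not PerfEnforce's implementation: the class name `IntegralScaler`, the available cluster sizes, the gain, and the relative-error formula are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class IntegralScaler:
    """Hypothetical integral controller over cluster size (illustrative only)."""

    sizes: tuple = (4, 6, 8, 10, 12)  # assumed set of available cluster sizes
    gain: float = 2.0                 # assumed integral gain, in workers per unit error
    state: float = 4.0                # continuous control signal, in workers

    def observe(self, actual_s: float, sla_s: float) -> int:
        """Update the controller with one finished query and return the next size."""
        # Relative error is positive when the query ran slower than the SLA promised.
        error = (actual_s - sla_s) / sla_s
        # Accumulate the error and clamp the signal to the supported range.
        lo, hi = self.sizes[0], self.sizes[-1]
        self.state = min(max(self.state + self.gain * error, lo), hi)
        # Snap the continuous signal to the nearest available cluster size.
        return min(self.sizes, key=lambda s: abs(s - self.state))


if __name__ == "__main__":
    scaler = IntegralScaler()
    # Each pair is (observed runtime, SLA runtime) in seconds; values are made up.
    for actual, promised in [(20.0, 10.0), (10.0, 10.0), (100.0, 10.0), (1.0, 10.0)]:
        print(scaler.observe(actual, promised))
```

A controller of this kind reacts only to past errors, so it scales up after a query has already missed its guarantee. A learned model, such as the perceptron the abstract mentions, could instead predict runtimes before a query executes.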
