SLA-based profit optimization for resource management of big data analytics-as-a-service platforms in cloud computing environments

The value that can be extracted from big data motivates organizations to explore data analytics technologies for better decision making and problem solving across a wide range of application domains. Cloud computing eases big data analytics by offering on-demand, scalable computing infrastructures, platforms, and applications as services. Big Data Analytics-as-a-Service (AaaS) platforms aim to deliver data analytics as consumable services in cloud computing environments under a pay-as-you-go model with Service Level Agreement (SLA) guarantees. Resource scheduling matters for AaaS platforms because big data analytics requires large-scale computing, which can consume large amounts of resources and incur high resource costs. Our research proposes automatic and scalable resource scheduling algorithms that maximize the profits of AaaS platforms while delivering services to users with SLA guarantees on budgets and deadlines, enabling timely responses at controllable costs. In this paper, we model and formulate the profit optimization resource scheduling problem and propose an optimization scheduling algorithm that maximizes profits for AaaS platforms and guarantees SLAs for query requests. Experimental evaluations show that the profit optimization scheduling algorithm performs significantly better in cost saving and profit enhancement than state-of-the-art scheduling algorithms.
