Dynamic Resource Provisioning for Data Streaming Applications in a Cloud Environment

The recent emergence of cloud computing is making the vision of utility computing realizable, i.e., computing resources and services from a cloud can be delivered, utilized, and paid for in the same fashion as utilities like water or electricity. Current cloud service providers have taken some steps towards supporting a true pay-as-you-go or utility-like pricing model, and current research points towards more fine-grained allocation and pricing of resources in the future. In such environments, resource provisioning becomes a challenging problem, since one needs to avoid both under-provisioning (leading to application slowdown) and over-provisioning (leading to unnecessary resource costs). In this paper, we consider this problem in the context of streaming applications. In these applications, since the data is generated by external sources, the goal is to carefully allocate resources so that the processing rate can match the rate of data arrival. We have developed a solution that can handle unexpected data rates, including transient rates. We evaluate our approach using two streaming applications in a virtualized environment.
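The provisioning goal stated above, matching the processing rate to the data arrival rate, can be illustrated with a minimal sketch. This is not the method developed in the paper; it is a simple proportional adjustment under the assumption that throughput scales roughly linearly with the CPU share given to the application. The function name `adjust_cpu_share` and the parameters `gain`, `min_share`, and `max_share` are hypothetical and chosen only for illustration.

```python
def adjust_cpu_share(cpu_share, arrival_rate, processing_rate,
                     gain=0.5, min_share=5.0, max_share=100.0):
    """Return a new CPU share (percent) nudged toward matching the arrival rate.

    Illustrative assumption: processing rate is proportional to CPU share.
    Rates are in items per second; shares are percentages of one CPU.
    """
    if processing_rate <= 0:
        # No measurable progress: fall back to the maximum allocation.
        return max_share

    # ratio > 1 means under-provisioned, ratio < 1 means over-provisioned.
    ratio = arrival_rate / processing_rate

    # Share that would match the arrival rate under the linear assumption.
    target = cpu_share * ratio

    # Move only part of the way (gain < 1) to damp transient rate spikes.
    new_share = cpu_share + gain * (target - cpu_share)

    # Keep the allocation within the allowed range.
    return max(min_share, min(max_share, new_share))
```

Calling the function once per monitoring interval with freshly measured rates raises the share when data arrives faster than it is processed and lowers it when the application is over-provisioned. A gain below 1 trades reaction speed for stability during short bursts.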
