Business-driven short-term management of a hybrid IT infrastructure

We consider the problem of managing a hybrid computing infrastructure whose processing elements comprise in-house dedicated machines, virtual machines acquired on demand from a cloud computing provider through short-term reservation contracts, and virtual machines made available by the remote peers of a best-effort peer-to-peer (P2P) grid. Each of these resource types has a different cost basis and different quality-of-service guarantees. The applications that run on this hybrid infrastructure are characterized by a utility function: the utility gained from completing an application depends on the time taken to execute it. We take a business-driven approach to managing this infrastructure, aiming to maximize the profit yielded, that is, the utility produced by the applications that are run minus the cost of the computing resources used to run them. We propose a heuristic to be used by a contract planner agent, which establishes the contracts with the cloud computing provider so as to balance the cost of running an application against the utility obtained from its execution, with the goal of producing a high overall profit. Our analytical results show that the simple heuristic proposed achieves very high relative efficiency in the use of the hybrid infrastructure. We also demonstrate that the ability to estimate the grid behaviour is an important condition for making contracts that allow such relative efficiency values to be achieved. On the other hand, our simulation results with realistic prediction errors show only a modest improvement in the profit achieved by the proposed heuristic when compared with two alternatives: a heuristic that does not consider the grid when planning contracts but still uses it, and another that is completely oblivious to the existence of the grid. This calls for the development of more accurate predictors for the availability of P2P grids, and of more elaborate heuristics that can better deal with the several sources of non-determinism present in this hybrid infrastructure.
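
The profit objective described above (utility of a completed application minus the cost of the resources used) can be illustrated with a minimal sketch. This is not the heuristic proposed in the paper: the utility shape, the cost model, the perfectly divisible workload, and every name and parameter below (`utility`, `best_plan`, `grid_estimate`, `price_per_vm_hour`, and so on) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    cloud_vms: int          # number of cloud VMs to reserve
    completion_time: float  # hours needed to finish the application
    cost: float             # cost of the reserved cloud VMs
    profit: float           # utility minus cost


def utility(completion_time, max_utility=100.0, deadline=10.0, decay=5.0):
    """Hypothetical utility function: full value up to the deadline,
    then a linear decay per hour of lateness, never below zero."""
    late = max(0.0, completion_time - deadline)
    return max(0.0, max_utility - decay * late)


def best_plan(work, in_house, grid_estimate, price_per_vm_hour, max_cloud_vms):
    """Exhaustively pick the number of cloud VMs that maximizes profit.

    Assumptions (not from the paper): `work` is measured in machine-hours
    and is perfectly divisible; in-house and grid machines are free;
    each cloud VM is paid per hour for the whole run; `grid_estimate`
    is the predicted number of grid machines available.
    """
    best = None
    for n in range(max_cloud_vms + 1):
        machines = in_house + n + grid_estimate
        if machines <= 0:
            continue  # no capacity at all, the application cannot run
        t = work / machines
        cost = n * price_per_vm_hour * t
        profit = utility(t) - cost
        if best is None or profit > best.profit:
            best = Plan(n, t, cost, profit)
    return best


if __name__ == "__main__":
    # Same workload planned with and without an estimate of grid capacity.
    print(best_plan(120.0, 4, 0, 1.0, 20))
    print(best_plan(120.0, 4, 4, 1.0, 20))
```

Under these assumptions, a planner that expects no grid machines reserves more cloud VMs than one that expects some grid capacity. This mirrors why the quality of the grid availability estimate matters for the contracts: an overestimate leads to too few reservations and late completion, while an underestimate leads to paying for capacity that was not needed.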
