Dynamic Resource Management in a HPC and Cloud Hybrid Environment

Large-scale data center clusters are now commonly built to support both HPC and Cloud computing. There are two reasons: (1) a data center is typically an environment shared by all users, who may submit different types of jobs (HPC and Cloud) for processing concurrently; (2) some applications can be divided into two sets of subtasks suited to HPC and Cloud computing respectively, the AMS (Alpha Magnetic Spectrometer) experiment being a typical example. To serve both computing models well, an HPC and Cloud hybrid environment is needed. One existing management mechanism allocates fixed proportions of resources to the different application environments. However, this approach has a significant drawback: low resource utilization. To overcome it, we propose a dynamic resource management framework and mechanism that satisfies the requirements of both HPC and Cloud computing. First, we present a prediction model that predicts the arrival rate of each kind of job (HPC and Cloud). Based on the predictions, we propose a dynamic resource allocation algorithm that uses queuing theory to manage the allocation. Finally, we evaluate our mechanism with real data sets from the AMS experiment and Cloud tasks running at the HPC center of Southeast University. The results show that the proposed mechanism improves resource utilization by at least 30% in this hybrid environment.
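The abstract does not specify the prediction model or the queuing formulation, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes exponential smoothing as a stand-in predictor and an M/M/c queue (Erlang C) to size each partition; all function names, the smoothing factor `alpha`, and the `max_wait_prob` target are hypothetical.

```python
import math


def predict_rate(history, alpha=0.5):
    """Exponentially smoothed arrival-rate estimate (assumed stand-in predictor)."""
    estimate = history[0]
    for observed in history[1:]:
        estimate = alpha * observed + (1 - alpha) * estimate
    return estimate


def erlang_c(servers, offered_load):
    """Probability that an arriving job has to wait in an M/M/c queue."""
    if offered_load >= servers:
        return 1.0  # unstable queue: every job waits
    # Erlang B via the standard numerically stable recurrence.
    b = 1.0
    for k in range(1, servers + 1):
        b = offered_load * b / (k + offered_load * b)
    rho = offered_load / servers
    return b / (1 - rho * (1 - b))


def nodes_needed(arrival_rate, service_rate, max_wait_prob):
    """Smallest node count whose waiting probability meets the target."""
    offered_load = arrival_rate / service_rate
    servers = max(1, math.ceil(offered_load))
    while erlang_c(servers, offered_load) > max_wait_prob:
        servers += 1
    return servers


def allocate(total_nodes, hpc_history, cloud_history,
             hpc_service_rate, cloud_service_rate, max_wait_prob=0.2):
    """Split a cluster between HPC and Cloud partitions from predicted demand."""
    hpc = nodes_needed(predict_rate(hpc_history), hpc_service_rate, max_wait_prob)
    cloud = nodes_needed(predict_rate(cloud_history), cloud_service_rate, max_wait_prob)
    demand = hpc + cloud
    if demand > total_nodes:
        # Not enough nodes: share the cluster in proportion to demand.
        hpc = min(total_nodes, max(1, round(total_nodes * hpc / demand)))
        cloud = total_nodes - hpc
    return {"hpc": hpc, "cloud": cloud, "idle": total_nodes - hpc - cloud}
```

Calling `allocate` once per scheduling interval with the latest arrival histories would let the partition sizes follow demand, in contrast to the fixed-proportion scheme; nodes reported as `idle` are those neither partition currently needs under the assumed waiting-probability target.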
