Proactive Thermal-Aware Resource Management in Virtualized HPC Cloud Datacenters

Clouds provide the abstraction of nearly unlimited computing resources through the elastic use of federated resource pools (virtualized datacenters). They are increasingly being considered for HPC applications, which have traditionally targeted grids and supercomputing clusters. However, maximizing the energy efficiency and utilization of cloud datacenter resources, avoiding undesired thermal hotspots (caused by overheating of over-utilized computing equipment), and ensuring quality-of-service guarantees for HPC applications are conflicting objectives that require joint consideration of multiple pairwise tradeoffs. An innovative proactive thermal-aware virtual machine consolidation technique (involving allocations as well as migrations) is proposed to maximize computing resource utilization, to minimize datacenter energy consumption for computing, and to improve the efficiency of heat extraction. The capability to migrate virtual machines away from lightly loaded servers in a thermal-aware manner opens up the opportunity to improve resource consolidation over time and, hence, to achieve the aforementioned goals. The effectiveness of the proposed technique is verified through experimental evaluations with HPC workload traces under single-datacenter as well as federated-datacenter scenarios.
