Proactive thermal-aware management in cloud datacenters

Abstract of the dissertation by Eun Kyung Lee. Dissertation Director: Professor Dario Pompili.

The complexity of modern datacenters is growing at an alarming rate due to the rising popularity of the cloud-computing paradigm as an effective means to cater to the ever-increasing demand for computing and storage. The management of modern datacenters is rapidly exceeding human ability, making autonomic approaches essential. Meanwhile, the increasing demand for faster computing and higher storage capacity has increased energy consumption and heat generation in datacenters. Because of this increased heat generation, cooling requirements have become a critical concern, both in terms of growing operating costs and in terms of environmental and societal impacts (e.g., increased CO2 emissions, overloading of the electric supply grid resulting in power cuts, and heavy water usage for cooling systems causing water scarcity). In this thesis, proactive thermal-aware datacenter management solutions, which include thermal- and energy-aware resource provisioning, cooling system optimization, and anomaly detection, are proposed to help minimize both the impact on the environment and the Total Cost of Ownership (TCO) of datacenters, making them energy efficient and green. For the proactive thermal-aware solutions, a novel architecture endowed with different abstract components is introduced, which is composed of four
