A CFD-Based Tool for Studying Temperature in Rack-Mounted Servers

Temperature-aware computing is becoming more important in the design of computer systems as power densities increase: high operating temperatures lead to higher component failure rates and greater demand for cooling capacity. Computer architects and system software designers need to understand the thermal consequences of their proposals and develop techniques that lower operating temperatures to reduce both transient and permanent component failures. Recognizing the need for thermal modeling tools to support such research, prior work has modeled processor temperatures at the micro-architectural level in a form that computer architects can easily understand and apply to processor design. However, there is a dearth of such tools in the academic/research community for architectural and systems studies beyond the processor: a server box, a rack, or even a machine room. In this paper we present ThermoStat, a detailed 3-dimensional computational fluid dynamics (CFD) based thermal modeling tool for rack-mounted server systems. We conduct several experiments with this tool to show how different load conditions affect the thermal profile, and illustrate how it can help in designing dynamic thermal management techniques. We propose reactive and proactive thermal management for rack-mounted servers and isothermal workload distribution across a rack.
