Cool and save: Cooling aware dynamic workload scheduling in multi-socket CPU systems

Traditionally, CPU workload scheduling and fan control in multi-socket systems have been designed separately, leading to less efficient solutions. In this paper, we present Cool and Save, a cooling-aware dynamic workload management strategy that is significantly more energy efficient than state-of-the-art solutions in multi-socket CPU systems because it performs workload scheduling in tandem with controlling socket fan speeds. Our experimental results indicate that our scheme yields average fan energy savings of 73% while reducing the maximum fan speed by 53%, thus lowering vibration and noise levels.
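
To illustrate why coordinating scheduling with fan control can save fan energy, the following is a minimal sketch and not the Cool and Save algorithm itself. It relies on the standard fan affinity law (fan power grows with the cube of fan speed) and on a deliberately simplified, hypothetical assumption that the fan speed a socket requires is proportional to its load. The function names, the 10 W rated fan power, and the example loads are all illustrative assumptions, not values from the paper.

```python
def fan_power(speed_fraction, rated_power_w=10.0):
    """Fan affinity law: power scales with the cube of fan speed.

    rated_power_w is a hypothetical full-speed fan power, chosen for illustration.
    """
    return rated_power_w * speed_fraction ** 3


def required_speed(socket_load):
    """Hypothetical model: required fan speed is proportional to socket load.

    Both load and speed are expressed as fractions clamped to [0, 1].
    """
    return max(0.0, min(1.0, socket_load))


def total_fan_power(socket_loads):
    """Sum the fan power over all sockets, one fan per socket."""
    return sum(fan_power(required_speed(load)) for load in socket_loads)


# Same total work (1.2 socket-loads) placed two different ways on two sockets.
unbalanced = [1.0, 0.2]  # scheduler ignores cooling: one hot socket, one idle
balanced = [0.6, 0.6]    # scheduler spreads load with cooling in mind

print("unbalanced fan power (W):", total_fan_power(unbalanced))
print("balanced fan power (W):  ", total_fan_power(balanced))
```

Under these assumptions, spreading the same work across sockets lowers both the total fan power and the peak fan speed, because the cubic term penalizes a single fast fan far more than two moderate ones. A real policy would also have to account for thermal dynamics, migration overhead, and per-socket differences, none of which this sketch models.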
