Models and Control Strategies for Data Center Energy Efficiency

As the foundation of the nation’s information infrastructure, data centers have been growing rapidly in both number and capacity to meet the increasing demands for highly responsive computing and massive storage. Data center energy consumption doubled from 2000 to 2006, reaching 60 TWh/year (terawatt-hours per year). Coupled with the increasing power and cooling demands imposed by Moore’s law and with the quest for high-density data centers, this trend has been rapidly raising the energy cost associated with data centers. Data centers are large cyber-physical systems (CPSs) with hundreds of variables that can be measured and controlled. The dynamics of the controlled processes span multiple time scales: electricity costs can fluctuate hourly, temperatures evolve on the order of minutes, and CPU power states can change on the order of milliseconds. Processes also differ in the spatial areas they influence: computer room air conditioners (CRACs) affect the inlet air temperatures of multiple servers, whereas CPU power states affect only a single server. The large number of constraints and their heterogeneous nature make data center control a challenging research problem. This dissertation considers data centers as CPSs, with a focus on run-time management and operating costs. The proposed modeling framework explicitly captures the cyber-physical nature of data centers and allows the development of models that represent both the computational and the thermal characteristics of a data center, as well as their interactions. The proposed control strategy manages both sets of characteristics jointly. It is based on a hierarchical/distributed control architecture that takes advantage of the modularity typically found in data centers. The hierarchy consists of three control levels.
The lower levels of the hierarchy handle fast dynamic processes, while the higher levels handle bulk thermal management and coordinate the controllers at the lower levels. The focus
