Hybrid surrogate model for online temperature and pressure predictions in data centers

Abstract The growth of cloud computing and big data storage has driven a significant expansion of data center (DC) infrastructure, which is now estimated to consume more than 1.5% of the world’s electricity. Because of suboptimal DC design and operation, a significant fraction of this energy is wasted through the cooling system’s inability to distribute cold air effectively to servers. Consequently, additional cooling air must be circulated inside a DC to prevent local hot spots, which leads to overcooling at other locations. Row-based cooling is an emerging architecture that provides more effective airflow distribution and thus lowers energy consumption. Optimizing airflow distribution for this architecture requires a general thermal model that predicts spatiotemporal temperature changes inside a DC, but available methods are unsuitable for accurate online predictions. Typical approaches include physical models, computational fluid dynamics (CFD) simulations, and black-box data-driven models (DDMs). All three approaches are limited because they do not capture all relevant operational parameters, are time-consuming, and can produce unacceptable errors during extrapolative predictions. We address these deficiencies with a fast, adaptive, and accurate hybrid surrogate model that combines a DDM with thermofluid transport relations to predict temperatures in a DC. Training data for the DDM are obtained from CFD simulations. An artificial neural network (ANN) with the Rectified Linear Unit (ReLU) activation function is shown to predict pressure distributions accurately in a DC with row-based cooling. These predicted pressures serve as inputs to the thermofluid transport equations, which determine the temperature distribution.
The applicability of the model is demonstrated by comparing its predictions with experimental measurements that characterize how the server workload distribution and the cooling unit operating conditions (temperature set-point, airflow rate, and fan locations) influence the temperature distribution. The model can be used to (1) improve cooling configuration design, (2) facilitate thermally aware workload management, and (3) test “what if” scenarios to characterize the influence of operating conditions on the temperature distribution.
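
The two-stage structure described in the abstract (an ANN with ReLU activations that predicts pressures, followed by transport relations that turn those pressures into temperatures) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the network size and weights, the square-root flow-resistance relation and its coefficient `k`, and the steady-state energy balance used for the server outlet temperature.

```python
import numpy as np

CP_AIR = 1005.0  # J/(kg K), specific heat of air near room temperature


def relu(x):
    """Rectified Linear Unit activation."""
    return np.maximum(x, 0.0)


def mlp_forward(x, weights, biases):
    """Forward pass of a fully connected network with ReLU hidden layers.

    The output layer is linear; here its outputs are read as pressures in Pa.
    """
    h = x
    for w, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ w + b)
    return h @ weights[-1] + biases[-1]


def mass_flow(dp, k):
    """Assumed flow-resistance relation: m = k * sign(dp) * sqrt(|dp|), in kg/s."""
    return k * np.sign(dp) * np.sqrt(np.abs(dp))


def outlet_temperature(t_in, power, m_dot):
    """Steady-state energy balance for air heated by a server: Q = m * cp * dT."""
    return t_in + power / (m_dot * CP_AIR)


if __name__ == "__main__":
    # Hypothetical hand-set weights; a real model would be trained on CFD data.
    weights = [np.array([[10.0, 0.0], [0.0, 5.0]]), np.eye(2)]
    biases = [np.zeros(2), np.zeros(2)]

    # Hypothetical normalized operating parameters (e.g. fan speed, workload).
    x = np.array([0.8, 0.4])

    p_cold, p_hot = mlp_forward(x, weights, biases)  # stage 1: pressures
    m_dot = mass_flow(p_cold - p_hot, k=0.05)        # pressure difference -> airflow
    t_out = outlet_temperature(18.0, 500.0, m_dot)   # stage 2: temperature
    print(f"mass flow = {m_dot:.3f} kg/s, outlet temperature = {t_out:.2f} C")
```

The point of the split is that only the pressure field is learned from data, while the temperature follows from conservation relations, which is one plausible reading of why such a hybrid would extrapolate better than a purely black-box temperature model.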
