Optimization of electrical infrastructures at data centers through a DoE-based approach

Data centers are critical environments that support a wide range of services and applications, so they must guarantee high availability and reliability. This work proposes a strategy based on models, SLA contracts, maintenance policies and optimization techniques for assessing the cost and availability of data center electrical infrastructures. The optimization strategy is based on design of experiments (DoE) and uses the availability importance index to identify the equipment that most affects system availability, so that improvements can be targeted at it. To assess availability, a hybrid modeling approach is adopted that combines the advantages of stochastic Petri nets and reliability block diagrams. Two case studies illustrate the applicability of the approach. The first compares the proposed strategy with a brute-force algorithm: the strategy obtained results close to the optimum in a fraction of the time. For example, the brute-force evaluation took more than 100 minutes, while the proposed strategy took only 6 seconds.
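
As a rough illustration of how an availability importance index can rank equipment, the sketch below evaluates a small reliability block diagram. It is not the authors' model. The topology, the component names and the availability values are hypothetical. The index uses a Birnbaum-style definition, namely system availability with the component always up minus system availability with it always down. This definition is an assumption, since the abstract does not state the formula used in the paper.

```python
from typing import Callable, Dict


def series(*avail: float) -> float:
    """Availability of blocks in series: all must be up."""
    result = 1.0
    for a in avail:
        result *= a
    return result


def parallel(*avail: float) -> float:
    """Availability of redundant blocks: at least one must be up."""
    unavail = 1.0
    for a in avail:
        unavail *= 1.0 - a
    return 1.0 - unavail


def system_availability(a: Dict[str, float]) -> float:
    # Hypothetical topology: UPS -> transformer -> two redundant power strips.
    return series(a["ups"], a["transformer"],
                  parallel(a["strip_a"], a["strip_b"]))


def availability_importance(
    system: Callable[[Dict[str, float]], float],
    a: Dict[str, float],
) -> Dict[str, float]:
    """Birnbaum-style index (assumed definition): A_sys(up) - A_sys(down)."""
    importance = {}
    for name in a:
        always_up = dict(a, **{name: 1.0})
        always_down = dict(a, **{name: 0.0})
        importance[name] = system(always_up) - system(always_down)
    return importance


# Illustrative availability values, not taken from the paper.
components = {"ups": 0.999, "transformer": 0.9995,
              "strip_a": 0.99, "strip_b": 0.99}

importance = availability_importance(system_availability, components)
ranking = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
for name, value in ranking:
    print(f"{name}: {value:.6f}")
```

In this toy system the non-redundant components in series rank highest, while each redundant power strip has a much smaller index. That is the kind of signal an optimization strategy could use to decide where an upgrade or extra redundancy is most worthwhile.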
