Energy-efficient allocation of computing node slots in HPC clusters through parameter learning and hybrid genetic fuzzy system modeling

Decision-making mechanisms for the online allocation of computing node slots in HPC clusters are commonly based on simple knowledge-based systems composed of individual sets of if–then rules. In contrast with previous works, where these rules were designed using expert knowledge, this paper compares two different types of evolutionary learning algorithms. In the first case, some of the numerical parameters defining a human-designed knowledge base are tuned. In the second case, a genetic fuzzy system evolves a partial rule set that, combined with some expert rules, forms the most appropriate knowledge base for a given load scenario. In both cases, the proposed approaches optimize the quality of service and the number of node reconfigurations along with the energy consumption. An experimental study was carried out using actual workloads from the Scientific Modeling Cluster at Oviedo University, and statistical evidence was found supporting the adoption of the new learning system.
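
The first case (tuning numerical parameters of a fixed, human-designed rule base) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two crisp if-then rules, the threshold names `on_thr` and `off_thr`, the toy simulator, the cost weights, and the simple (1+1) evolutionary loop that stands in for the learning algorithms actually compared by the authors. The three objectives named in the abstract (energy, quality of service, reconfigurations) are collapsed here into a weighted sum purely for brevity.

```python
import random

def simulate(thresholds, load, total_nodes=8):
    """Replay a load trace under two if-then rules.

    Returns (energy, unmet_demand, reconfigurations). All quantities are
    hypothetical stand-ins for the objectives named in the abstract.
    """
    on_thr, off_thr = thresholds
    active = total_nodes // 2
    energy = unmet = reconfigs = 0
    for demand in load:
        spare = active - demand
        # Rule 1: if spare slots fall below on_thr, power one node on.
        if spare < on_thr and active < total_nodes:
            active += 1
            reconfigs += 1
        # Rule 2: if spare slots exceed off_thr, power one node off.
        elif spare > off_thr and active > 1:
            active -= 1
            reconfigs += 1
        energy += active                  # one energy unit per active node
        unmet += max(0, demand - active)  # crude quality-of-service penalty
    return energy, unmet, reconfigs

def cost(thresholds, load, weights=(1.0, 5.0, 0.5)):
    """Weighted sum of the three objectives (weights are assumed)."""
    return sum(w * v for w, v in zip(weights, simulate(thresholds, load)))

def tune(load, start=(1.0, 3.0), generations=200, seed=0):
    """(1+1) evolutionary tuning: mutate the thresholds, keep improvements."""
    rng = random.Random(seed)
    best, best_cost = start, cost(start, load)
    for _ in range(generations):
        candidate = tuple(p + rng.gauss(0, 0.5) for p in best)
        candidate_cost = cost(candidate, load)
        if candidate_cost <= best_cost:
            best, best_cost = candidate, candidate_cost
    return best, best_cost

if __name__ == "__main__":
    trace = [2, 3, 5, 7, 6, 4, 2, 1] * 5   # synthetic load, not real data
    tuned, tuned_cost = tune(trace)
    print("expert thresholds cost:", cost((1.0, 3.0), trace))
    print("tuned thresholds:", tuned, "cost:", tuned_cost)
```

Because the loop only accepts candidates that are no worse than the current best, the tuned cost can never exceed the cost of the starting, expert-chosen thresholds on the trace used for tuning. Whether that gain carries over to unseen workloads is a separate question that this sketch does not address.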
