A feasibility analysis of power-awareness and energy minimization in modern interconnects for high-performance computing

High-performance computing (HPC) systems consume a significant amount of power, resulting in high operational costs, reduced reliability, and waste of natural resources. Power consumption has therefore become an increasingly important design constraint in high-performance clusters, and research on power-aware HPC has emerged in response. While most of this research has focused on understanding and exploiting applications' behavior to scale down the CPU for energy savings, this paper demonstrates the positive impact of modern interconnects on the energy efficiency of high-performance clusters. We first present the power-performance profiles of Myrinet-2000 and Quadrics QsNetII at the user level and the MPI level, in comparison with a traditional, non-offloaded Gigabit Ethernet. This information enables us to devise a power-aware MPI runtime library that automatically and transparently performs message segmentation and reassembly to increase energy savings. Second, by designing and evaluating a number of all-gather collectives, we argue that the energy efficiency of a cluster can be increased by optimizing its messaging layers.
