Minimizing execution time in MPI programs on an energy-constrained, power-scalable cluster

Recently, the high-performance computing community has realized that power is a performance-limiting factor. One reason is that supercomputing centers have limited power capacity, and machines are starting to reach that limit. In addition, the cost of energy has become increasingly significant, and the heat produced by higher-energy components tends to reduce their reliability. One way to reduce power (and therefore energy) requirements is to use high-performance cluster nodes that are frequency- and voltage-scalable (e.g., AMD-64 processors). The problem we address in this paper is: given a target program, a power-scalable cluster, and an external upper limit on energy consumption, choose a schedule (number of nodes and CPU frequency) that (1) satisfies the energy limit and (2) minimizes execution time. There are too many schedules for an exhaustive search. Therefore, we find a schedule through a novel combination of performance modeling, performance prediction, and program execution. Using our technique, we find a near-optimal schedule for all of our benchmarks in just a handful of partial program executions.
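
The selection problem stated above can be illustrated with a minimal sketch. It assumes that a predicted execution time and energy are already available for a small set of candidate schedules. It is not the paper's search technique, which exists precisely because enumerating every schedule is impractical. The function name `best_schedule`, the units, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional, Tuple

# A schedule is (number of nodes, CPU frequency in GHz).
Schedule = Tuple[int, float]
# A prediction is (execution time in seconds, energy in kJ); units are assumed.
Prediction = Tuple[float, float]


def best_schedule(
    predictions: Dict[Schedule, Prediction], energy_limit: float
) -> Optional[Schedule]:
    """Return the fastest schedule whose predicted energy is within the limit."""
    # Keep only schedules that satisfy the energy constraint.
    feasible = {s: p for s, p in predictions.items() if p[1] <= energy_limit}
    if not feasible:
        return None  # no candidate meets the energy limit
    # Among feasible schedules, pick the one with the smallest predicted time.
    return min(feasible, key=lambda s: feasible[s][0])


# Hypothetical predictions, invented for this example only.
predictions = {
    (4, 2.0): (100.0, 40.0),
    (4, 1.6): (115.0, 34.0),
    (8, 2.0): (60.0, 48.0),
    (8, 1.6): (70.0, 41.0),
}

chosen = best_schedule(predictions, energy_limit=42.0)
print(chosen)
```

With these made-up numbers, the fastest schedule overall, (8, 2.0), exceeds the 42 kJ limit, so the slower-clocked 8-node schedule is selected instead. This is the kind of energy-time trade-off the problem statement describes.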
