Scalable performance bounding under multiple constrained renewable resources

In the age of exascale computing, it is crucial to deliver the best possible performance under power constraints. A major part of this optimization is distributing power and bandwidth intelligently across a cluster. Although the power efficiency of HPC runtimes has improved significantly, little work has explored how to determine the theoretically optimal performance under a given power and bandwidth bound. In this paper, we present a scalable model that identifies the power and bandwidth distribution that minimizes the makespan of a program. We use a network flow formulation to construct a linear program that is efficient to solve. We demonstrate the applicability of the model to MPI programs and report synthetic benchmarks of the model's performance.
