Energy Conservation Through Cloned Execution Of Simulations

High-performance computing facilities used for scientific computing draw enormous amounts of energy, some of them consuming many megawatt-hours. Reducing the energy consumed by computations on such facilities can dramatically lower their total operating cost and help reduce their environmental impact. Here, we focus on a way to reduce energy consumption in many ensembles of simulations. Using the method of simulation cloning to exploit parallelism while also significantly reducing computational and memory requirements, we perform a detailed empirical study of the energy consumed on a large supercomputer equipped with hardware accelerator cards (graphics processing units, GPUs). We build on previous insights from the mathematical analysis and implementation of cloned simulations, which yield computational and memory savings of several orders of magnitude. Using instrumentation to track the power drawn by thousands of accelerator cards, we report significant aggregate energy savings from cloned simulations.
