GPU-Accelerated Molecular Dynamics: Energy Consumption and Performance

Energy consumption of hybrid systems is a pressing problem in modern high-performance computing, and the trade-off between power consumption and performance is becoming increasingly prominent. In this paper, we discuss the energy and power efficiency of two modern hybrid minicomputers, the Jetson TK1 and TX1. We use the Empirical Roofline Tool to obtain peak performance data and the molecular dynamics package LAMMPS as an example of a real-world benchmark. Using a precise wattmeter, we measure the Jetsons' power consumption profiles. We also examine the effectiveness of dynamic voltage and frequency scaling (DVFS). We determine the optimal GPU and DRAM frequencies that give the minimum energy-to-solution value.
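
As an illustration of the energy-to-solution metric mentioned above, the following minimal sketch integrates a sampled power profile over the run time and picks the frequency pair with the lowest total energy. It is not the authors' measurement code: the function names, the trapezoidal integration, and the (GPU MHz, DRAM MHz) keys are assumptions made for this example, and the sample numbers are placeholders, not measured values.

```python
from typing import Dict, Sequence, Tuple

Profile = Tuple[Sequence[float], Sequence[float]]  # (timestamps in s, power in W)


def energy_to_solution(times_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate power over time with the trapezoidal rule; returns joules."""
    if len(times_s) != len(power_w) or len(times_s) < 2:
        raise ValueError("need at least two matching time/power samples")
    energy = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        energy += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return energy


def best_setting(profiles: Dict[Tuple[int, int], Profile]) -> Tuple[Tuple[int, int], float]:
    """Return the (gpu_mhz, dram_mhz) key with the minimum energy-to-solution."""
    energies = {key: energy_to_solution(t, p) for key, (t, p) in profiles.items()}
    best = min(energies, key=energies.get)
    return best, energies[best]


if __name__ == "__main__":
    # Placeholder profiles (hypothetical, NOT measurements): a slow run at low
    # power, a medium run, and a fast run at high power.
    profiles = {
        (200, 400): ([0.0, 20.0, 40.0], [4.0, 4.0, 4.0]),
        (400, 800): ([0.0, 10.0, 20.0], [6.0, 6.0, 6.0]),
        (800, 1600): ([0.0, 8.0, 16.0], [10.0, 10.0, 10.0]),
    }
    setting, joules = best_setting(profiles)
    print(setting, joules)
```

The sketch shows why the minimum need not lie at either frequency extreme: lowering the frequency reduces power but lengthens the run, so the product of the two can reach its minimum at an intermediate setting.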
