Asynchronous in Situ Processing with Gromacs: Taking Advantage of GPUs

Numerical simulations on supercomputers are producing an ever-growing amount of data. Efficient production and analysis of these data are key to future discoveries. The in situ paradigm is emerging as a promising way to avoid the file-system I/O bottleneck for both the simulation and the analytics, by processing the data in memory as soon as they are produced. Various strategies and implementations have been proposed in recent years to support in situ processing with a low impact on simulation performance. Yet little effort has been made on in situ analytics for hybrid simulations that use accelerators such as GPUs. In this article, we study in situ strategies with Gromacs, a molecular dynamics simulation code supporting multiple GPUs, as our target application. We focus specifically on how the simulation and the in situ analytics use the computational resources of the machine. Finally, we extend the usual in situ placement strategies to the case of in situ analytics running on a GPU, and study their impact on both Gromacs performance and the resource usage of the machine. In particular, we show that running in situ analytics on the GPU can be more efficient than running them on the CPU, especially when the CPU is the bottleneck of the simulation.
