Scientific Workflows at DataWarp-Speed: Accelerated Data-Intensive Science Using NERSC's Burst Buffer

Emerging exascale systems have the ability to accelerate the time-to-discovery for scientific workflows. However, as these workflows become more complex, the data they generate has grown at an unprecedented rate, making I/O a significant constraint. To address this problem, advanced memory hierarchies, such as burst buffers, have been proposed as intermediate layers between the compute nodes and the parallel file system. In this paper, we use the Cray DataWarp burst buffer, coupled with in-transit processing mechanisms, to demonstrate the advantages of advanced memory hierarchies in preserving traditional coupled scientific workflows. We consider an in-transit workflow that couples simulation of subsurface flows with on-the-fly flow visualization. For this workflow, we study the performance of the Cray DataWarp burst buffer and compare it with the Lustre parallel file system.
