Coupling Exascale Multiphysics Applications: Methods and Lessons Learned

With the growing computational demands of science and the complexity of new and emerging hardware, it is time to re-evaluate the traditional monolithic design of computational codes. One emerging paradigm constructs larger scientific computational experiments by coupling multiple individual scientific applications, each targeting its own physics, characteristic lengths, and/or scales. We present a framework built by leveraging capabilities such as in-memory communication, workflow scheduling on HPC resources, and continuous performance monitoring. This code-coupling capability is demonstrated on a fusion science scenario in which the plasma at the edge and at the core of the device requires different physical descriptions. The infrastructure not only couples the physics components, but also connects in situ (online) analysis, compression, and visualization, shortening the time between a run and the analysis of its scientific content. Results from runs on Titan and Cori are presented as a demonstration.
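To make the coupling pattern concrete, the following is a minimal, purely illustrative sketch and not the paper's actual framework or HPC implementation. It shows two solver components (labeled "core" and "edge") exchanging an interface field through an in-memory channel each step, while an online consumer receives reduced data for analysis instead of waiting for post hoc files. The component names, the step count, the interface size, and the use of multiprocessing queues as a stand-in for an in-memory staging/transport layer are all assumptions made for illustration.

```python
# Illustrative sketch only (not the paper's framework): two solver
# components ("core" and "edge") exchange an interface field through an
# in-memory channel each step, while an in situ consumer receives
# reduced data for analysis. multiprocessing.Queue stands in for an
# in-memory staging/transport layer.
import numpy as np
from multiprocessing import Process, Queue

STEPS = 5          # hypothetical number of coupled time steps
N_INTERFACE = 64   # hypothetical size of the shared spatial interface

def core_solver(to_edge: Queue, from_edge: Queue, to_analysis: Queue):
    """Advance a mock 'core' field and swap interface values with 'edge'."""
    field = np.zeros(N_INTERFACE)
    for step in range(STEPS):
        field += np.random.rand(N_INTERFACE) * 0.1   # stand-in for physics
        to_edge.put((step, field.copy()))            # publish interface data
        _, edge_field = from_edge.get()              # receive coupled data
        field = 0.5 * (field + edge_field)           # blend across interface
        to_analysis.put((step, "core", float(field.mean())))  # reduced output
    to_analysis.put(None)                            # signal completion

def edge_solver(to_core: Queue, from_core: Queue):
    """Mirror component for the 'edge' region."""
    field = np.ones(N_INTERFACE)
    for step in range(STEPS):
        field -= np.random.rand(N_INTERFACE) * 0.1
        to_core.put((step, field.copy()))
        _, core_field = from_core.get()
        field = 0.5 * (field + core_field)

def analysis(from_solvers: Queue):
    """Online consumer: print a per-step summary as data arrives."""
    while (item := from_solvers.get()) is not None:
        step, source, mean_val = item
        print(f"step {step}: {source} interface mean = {mean_val:.4f}")

if __name__ == "__main__":
    core_to_edge, edge_to_core, to_analysis = Queue(), Queue(), Queue()
    procs = [
        Process(target=core_solver, args=(core_to_edge, edge_to_core, to_analysis)),
        Process(target=edge_solver, args=(edge_to_core, core_to_edge)),
        Process(target=analysis, args=(to_analysis,)),
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

In a production setting the queues would be replaced by an asynchronous, memory-to-memory data plane between separately launched parallel applications, with a workflow layer handling placement and scheduling on the HPC resource; the sketch only conveys the per-step publish/receive/analyze structure of loose coupling.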
