Korali: a High-Performance Computing Framework for Stochastic Optimization and Bayesian Uncertainty Quantification

We present a modular, open-source, high-performance computing framework for data-driven Bayesian uncertainty quantification and stochastic optimization. The proposed framework, Korali, is well suited for the non-intrusive sampling of computationally demanding engineering and scientific models. Its distributed-execution engine allows for the efficient execution of massively parallel computational models while providing fault tolerance and load balancing mechanisms. In this paper, we present the framework's design principles and explain its flexibility in allowing scientists to deploy stochastic methods at scale. We demonstrate the capabilities of Korali for Bayesian inference and optimization studies using existing high-performance software such as LAMMPS (CPU-based) and Mirheo (GPU-based), and show efficient scaling on up to 4096 nodes of the CSCS Piz Daint supercomputer.
