Parallelisation of equation-based simulation programs on heterogeneous computing systems

The numerical solution of equation-based simulations requires computationally intensive tasks such as the evaluation of model equations, linear algebra operations and the solution of systems of linear equations. The focus of this work is on the parallel evaluation of model equations on shared memory systems such as general-purpose processors (multi-core CPUs and manycore devices), streaming processors (Graphics Processing Units and Field Programmable Gate Arrays) and heterogeneous systems. The current approaches for the evaluation of model equations are reviewed and their capabilities and shortcomings analysed. Since stream computing differs from traditional computing in that the system processes a sequential stream of elements, equations must be transformed into a data structure suitable for both types of computing. Postfix notation expression stacks are recognised as a platform- and programming-language-independent method to describe, store in computer memory and evaluate general systems of differential and algebraic equations of any size. Each mathematical operation and its operands are described by a specially designed data structure, and every equation is transformed into an array of these structures (a Compute Stack). Compute Stacks are evaluated by a stack machine using a Last In First Out queue. The stack machine is implemented in the DAE Tools modelling software in the C99 language using two Application Programming Interfaces (APIs)/frameworks for parallelism. The Open Multi-Processing (OpenMP) API is used for parallelisation on general-purpose processors, and the Open Computing Language (OpenCL) framework is used for parallelisation on streaming processors and heterogeneous systems. The performance of the sequential Compute Stack approach is compared to the direct C++ implementation and to the previous approach, which uses evaluation trees. The new approach is 45% slower than the C++ implementation and more than five times faster than the previous one.
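The Compute Stack idea described above can be illustrated with a minimal C99 sketch. It is not the DAE Tools implementation: the type and function names (`cs_item_t`, `cs_evaluate`, `cs_evaluate_all`), the fixed stack depth, the restriction to four binary operators and the offsets-array layout are all assumptions made for illustration. Each equation is stored as a contiguous run of items in postfix order, a small LIFO stack evaluates it, and the loop over equations is marked for OpenMP because the equations are evaluated independently of one another.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical item kinds; illustrative names, not DAE Tools identifiers. */
typedef enum { CS_CONST, CS_VAR, CS_ADD, CS_SUB, CS_MUL, CS_DIV } cs_kind_t;

/* One operation or operand of an equation. */
typedef struct {
    cs_kind_t kind;
    double    value;  /* used by CS_CONST */
    size_t    index;  /* used by CS_VAR: position in the variable array */
} cs_item_t;

#define CS_MAX_DEPTH 32  /* assumed upper bound on the stack depth */

/* Evaluate a single equation given as n items in postfix order. */
double cs_evaluate(const cs_item_t *items, size_t n, const double *x)
{
    double stack[CS_MAX_DEPTH];
    size_t top = 0;

    for (size_t i = 0; i < n; ++i) {
        const cs_item_t *it = &items[i];
        if (it->kind == CS_CONST) {
            assert(top < CS_MAX_DEPTH);
            stack[top++] = it->value;          /* push a constant */
        } else if (it->kind == CS_VAR) {
            assert(top < CS_MAX_DEPTH);
            stack[top++] = x[it->index];       /* push a variable value */
        } else {
            assert(top >= 2);
            double rhs = stack[--top];         /* pop the right operand first */
            double lhs = stack[--top];
            double result;
            switch (it->kind) {
            case CS_ADD: result = lhs + rhs; break;
            case CS_SUB: result = lhs - rhs; break;
            case CS_MUL: result = lhs * rhs; break;
            default:     result = lhs / rhs; break;  /* CS_DIV */
            }
            stack[top++] = result;             /* push the result */
        }
    }
    assert(top == 1);                          /* a well-formed stack leaves one value */
    return stack[0];
}

/* Evaluate n_eq equations stored back to back in `items`.
   Equation e occupies items[offsets[e]] .. items[offsets[e + 1] - 1]. */
void cs_evaluate_all(const cs_item_t *items, const size_t *offsets,
                     size_t n_eq, const double *x, double *residuals)
{
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (long e = 0; e < (long)n_eq; ++e) {
        size_t begin = offsets[e];
        size_t count = offsets[e + 1] - begin;
        residuals[e] = cs_evaluate(items + begin, count, x);
    }
}
```

As a usage example, the expression `x0 * x1 - 2` would be stored as the five items `VAR 0, VAR 1, MUL, CONST 2, SUB`. Because the items are plain data in a flat array, the same buffers could in principle be handed to an OpenCL kernel that runs an equivalent loop body per work-item, which is the portability argument the abstract makes.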
The OpenMP and OpenCL implementations are tested on three medium-scale models using a multi-core CPU, a discrete GPU, an integrated GPU and heterogeneous computing setups. Execution times are compared and analysed, and the advantages of the OpenCL implementation running on a discrete GPU and heterogeneous systems are discussed. It is found that the evaluation of model equations using the parallel OpenCL implementation running on a discrete GPU is up to twelve times faster than the sequential version while the overall simulation speed-up gained is more than three times.

How to cite this article: Nikolić (2018), Parallelisation of equation-based simulation programs on heterogeneous computing systems. PeerJ Comput. Sci. 4:e160; DOI 10.7717/peerj-cs.160. Submitted 22 March 2018; accepted 20 July 2018; published 13 August 2018. Corresponding author: Dragan D. Nikolić, dnikolic@daetools.com. Academic editor: Linda Petzold. Copyright 2018 Nikolić; distributed under Creative Commons CC-BY 4.0. Subjects: Distributed and Parallel Computing, Scientific Computing and Simulation.
