Comparing CUDA, OpenCL and OpenGL Implementations of the Cardiac Monodomain Equations

Computer simulations of cardiac electrophysiology are a helpful tool in the study of the bioelectric activity of the heart. The cardiac monodomain model comprises a nonlinear system of partial differential equations, and its numerical solution is computationally very intensive because of the fine spatial and temporal resolution required. Recent studies have shown that using the GPU as a general-purpose processor can greatly improve simulation performance. The aim of this work is to study the performance of different GPU programming interfaces for the solution of the cardiac monodomain equations. Three GPU implementations, based on OpenGL, NVIDIA CUDA and OpenCL, are compared with a multicore CPU implementation that uses OpenMP. The OpenGL approach proved to be the fastest for the solution of the nonlinear system of ordinary differential equations (ODEs) associated with the cardiac model, with a speedup of 446 relative to the multicore implementation, whereas CUDA was the fastest for the numerical solution of the parabolic partial differential equation, with a speedup of 8. Although OpenCL provides code portability between different accelerators, the OpenCL version was slower than CUDA for the solution of the parabolic equation and as fast as CUDA for the solution of the system of ODEs. OpenCL is thus a portable way of programming scientific applications, but it is not as efficient as CUDA when running on NVIDIA GPUs.
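The split between an ODE system and a parabolic equation mentioned above can be illustrated with a minimal operator-splitting sketch on a 1D cable. Everything in it is an assumption made for illustration: the FitzHugh-Nagumo-type cell model and its parameters stand in for whatever cell model the authors used, the explicit finite-difference diffusion step is not necessarily their scheme, and the function names (`ode_step`, `diffusion_step`, `simulate`) are hypothetical. Plain Python is used for readability rather than performance.

```python
def ode_step(v, w, dt, a=0.1, eps=0.01, gamma=0.5):
    """One forward Euler step of a FitzHugh-Nagumo-type cell model.

    Assumed stand-in model, not the one used in the paper. The update
    uses only the state of a single node, so every node can be
    advanced independently of the others.
    """
    dv = v * (1.0 - v) * (v - a) - w
    dw = eps * (v - gamma * w)
    return v + dt * dv, w + dt * dw


def diffusion_step(v, dt, dx, d):
    """One explicit step of dv/dt = d * d2v/dx2 with no-flux boundaries.

    Unlike the ODE step, each node needs the values of its neighbours.
    """
    n = len(v)
    out = [0.0] * n
    for i in range(n):
        # Mirror the neighbour at the ends to impose a zero-flux boundary.
        left = v[i - 1] if i > 0 else v[1]
        right = v[i + 1] if i < n - 1 else v[n - 2]
        out[i] = v[i] + dt * d * (left - 2.0 * v[i] + right) / (dx * dx)
    return out


def simulate(n=100, steps=2000, dt=0.05, dx=1.0, d=1.0):
    """Alternate the reaction (ODE) and diffusion (PDE) sub-steps."""
    # Depolarize a few nodes at the left end as an initial stimulus.
    v = [1.0 if i < 5 else 0.0 for i in range(n)]
    w = [0.0] * n
    for _ in range(steps):
        # Reaction sub-step: one independent ODE update per node.
        for i in range(n):
            v[i], w[i] = ode_step(v[i], w[i], dt)
        # Diffusion sub-step: couples neighbouring nodes.
        v = diffusion_step(v, dt, dx, d)
    return v, w
```

Calling `v, w = simulate()` returns the final membrane-potential and recovery-variable arrays. The default values satisfy the explicit stability bound `dt * d / dx**2 <= 0.5`; a different discretization would need its own check. The sketch also makes the structural difference between the two sub-steps visible: the ODE update touches one node at a time, while the diffusion update reads neighbouring values.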
