Parallel Optimisation Strategies for Fusion Codes

We have previously documented the ongoing work in the EUFORIA project to parallelise and optimise European fusion simulation codes. This involves working with a wide range of codes to address any performance and scaling issues they have. However, as no two simulation codes are exactly the same, it is very hard to apply a single approach to optimising a disparate range of codes. Indeed, the codes investigated range from well-optimised, highly parallelised codes to serial or poorly performing codes. After analysing, optimising, and parallelising a range of codes, it is nevertheless possible to discern a number of distinct optimisation techniques and strategies that can be used to improve the performance or scaling of a parallel simulation code. This paper outlines the distinct approaches that we have identified, highlights their benefits and drawbacks, and gives an overview of the type of work that is often attempted for fusion simulation code optimisation.
