Paper information - Parallelization of FDM/FEM computation for PDEs on PARAM YUVA-II cluster of Xeon Phi coprocessors

Parallelization of FDM/FEM computation for PDEs on PARAM YUVA-II cluster of Xeon Phi coprocessors

This paper discusses an efficient implementation of finite difference method (FDM) and finite element method (FEM) computations for a partial differential equation (the Poisson equation) on a message-passing cluster with Intel Xeon Phi coprocessors [6,15]. We performed the computations on PARAM YUVA-II [9], a message-passing cluster whose compute nodes combine Xeon multi-core processors with Xeon Phi coprocessors [6,15,17-19]. A combination of OpenMP [4] and MPI [5,19,20] is used for the structured-grid FDM computations. Unstructured triangular and hexahedral meshes and the graph partitioning software METIS [10] are used in the FEM computations. The Jacobi iterative method is used to solve the resulting system of linear equations. A detailed performance analysis of the optimizations on the Xeon Phi coprocessor using the OpenMP and MPI frameworks is presented. Our experiments indicate that the MPI-OpenMP codes for the FDM computations achieve 2X to 3X speed-ups for large mesh sizes. The FEM implementation shows only a marginal improvement in speed-up on the Xeon Phi cluster.
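To make the numerical core of the abstract concrete, the following is a minimal serial sketch of a Jacobi iteration for the Poisson equation on a structured grid, using the standard 5-point finite difference stencil. It is not the authors' code: the function name `jacobi_poisson`, its parameters, the unit-square domain, and the zero Dirichlet boundary condition are all assumptions made for illustration, and Python is used here instead of the MPI/OpenMP setting the paper describes.

```python
import math


def jacobi_poisson(f, n, tol=1e-8, max_iter=20000):
    """Approximate -laplace(u) = f on the unit square with u = 0 on the boundary.

    Hypothetical helper for illustration only. The grid has n interior
    points per direction (n + 2 including the boundary) and spacing
    h = 1 / (n + 1). Returns the grid values and the iteration count.
    """
    h = 1.0 / (n + 1)
    # Initial guess: zero everywhere (boundary rows and columns stay zero).
    u = [[0.0] * (n + 2) for _ in range(n + 2)]
    # Sample the right-hand side once at every grid point.
    rhs = [[f(i * h, j * h) for j in range(n + 2)] for i in range(n + 2)]

    for it in range(1, max_iter + 1):
        new = [row[:] for row in u]
        diff = 0.0
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                # Jacobi update: every new value depends only on the
                # previous iterate, so all points could be updated in parallel.
                new[i][j] = 0.25 * (
                    u[i - 1][j] + u[i + 1][j]
                    + u[i][j - 1] + u[i][j + 1]
                    + h * h * rhs[i][j]
                )
                diff = max(diff, abs(new[i][j] - u[i][j]))
        u = new
        # Stop once the largest pointwise change falls below the tolerance.
        if diff < tol:
            return u, it
    return u, max_iter


if __name__ == "__main__":
    # Test problem with known solution u(x, y) = sin(pi x) sin(pi y).
    source = lambda x, y: 2 * math.pi ** 2 * math.sin(math.pi * x) * math.sin(math.pi * y)
    u, iters = jacobi_poisson(source, 15)
    print("iterations:", iters, "value at centre:", u[8][8])
```

Because each Jacobi update reads only the previous iterate, the two inner loops have no dependencies between grid points. That property is what makes the method a natural fit for a hybrid scheme in which threads share the loops within a node and neighbouring subdomains exchange boundary values between nodes.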

Sonia Rani | Samrit Kumar Maity | Krishan Gopal Gupta | Vudutala China V. Rao

[1] Chaman Singh Verma, et al. Parallelization of finite volume computations for heat transfer application using unstructured mesh partitioning algorithms, 1997, Proceedings Fourth International Conference on High-Performance Computing.

[2] Jie Cheng, et al. Programming Massively Parallel Processors. A Hands-on Approach, 2010, Scalable Comput. Pract. Exp.

[3] William Gropp, et al. MPI: The Complete Reference, Vol. 2 - The MPI-2 Extensions, 1998.

[4] Sami Saarinen, et al. Best Practice Guide – Intel Xeon Phi, 2013.

[5] Paulius Micikevicius, et al. 3D finite difference computation on GPUs using CUDA, 2009, GPGPU-2.

[6] James Reinders, et al. Intel Xeon Phi Coprocessor High Performance Programming, 2013.

[7] Timothy J. Tautges, et al. Jaal: Engineering a High Quality All-Quadrilateral Mesh Generator, 2011, IMR.

[8] John E. Stone, et al. GPU clusters for high-performance computing, 2009, 2009 IEEE International Conference on Cluster Computing and Workshops.

[9] Robert Strzodka, et al. Exploring weak scalability for FEM calculations on a GPU-enhanced cluster, 2007, Parallel Comput.