Parallelization of the Multilevel Fast Multipole Algorithm by Combined Use of OpenMP and VALU Hardware Acceleration

A parallel scheme that combines OpenMP with vector arithmetic logic unit (VALU) hardware acceleration is presented to speed up the multilevel fast multipole algorithm (MLFMA) on shared-memory computers. The performance of the hybrid parallel OpenMP-VALU MLFMA is investigated, and several strategies are employed to improve the overall speedup and parallel efficiency. The effectiveness of the hybrid parallel scheme is verified by numerical results for electromagnetic (EM) scattering examples, and the related numerical stability issue is also discussed.
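To make the two levels of parallelism concrete, the following is a minimal sketch in C, not the implementation described in this paper. It assumes the diagonal translation step of the MLFMA, in which each cluster's outgoing pattern is multiplied pointwise by a translator and accumulated into an incoming pattern. OpenMP threads split the work across clusters, and the inner loop over sampling directions is marked for the vector unit with `#pragma omp simd`. The function name `translate_clusters`, the split real/imaginary array layout, the single-precision type, and the use of one shared translator for all clusters are illustrative assumptions.

```c
#include <stddef.h>

/*
 * Hypothetical hybrid kernel (illustrative only):
 *   in[c][k] += T[k] * out[c][k]   (complex multiply-accumulate)
 * for every cluster c and every sampling direction k.
 * Complex values are stored as separate real and imaginary arrays
 * so that the inner loop maps cleanly onto SIMD lanes.
 * The input and output arrays are assumed not to overlap.
 */
void translate_clusters(size_t n_clusters, size_t n_dirs,
                        const float *t_re, const float *t_im,
                        const float *out_re, const float *out_im,
                        float *in_re, float *in_im)
{
    /* Coarse-grained level: clusters are distributed over OpenMP threads. */
    #pragma omp parallel for schedule(static)
    for (long c = 0; c < (long)n_clusters; ++c) {
        const float *src_re = out_re + (size_t)c * n_dirs;
        const float *src_im = out_im + (size_t)c * n_dirs;
        float *dst_re = in_re + (size_t)c * n_dirs;
        float *dst_im = in_im + (size_t)c * n_dirs;

        /* Fine-grained level: directions are processed by the vector unit. */
        #pragma omp simd
        for (size_t k = 0; k < n_dirs; ++k) {
            dst_re[k] += t_re[k] * src_re[k] - t_im[k] * src_im[k];
            dst_im[k] += t_re[k] * src_im[k] + t_im[k] * src_re[k];
        }
    }
}
```

When compiled without OpenMP support, the pragmas are ignored and the routine runs serially with the same result, which gives a convenient baseline for measuring speedup.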
