Paper information - A novel parallel FDTD algorithm on Non-Uniform Memory Access multiprocessors

A novel parallel FDTD algorithm on Non-Uniform Memory Access multiprocessors

Choosing a good thread and data distribution scheme is critical to the performance of data-parallel applications on Non-Uniform Memory Access (NUMA) architecture workstations. In this paper, we introduce a novel parallel finite-difference time-domain (FDTD) algorithm that optimizes application thread affinity on a NUMA architecture workstation. The algorithm achieves excellent performance on both an ideal test case and an inverted-F antenna example.
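The abstract does not describe the algorithm itself, so the following is only a minimal sketch of the general idea behind affinity-aware FDTD, not the authors' method. It assumes a 1D grid split into contiguous slabs, one worker per slab, with each worker pinned to a CPU on a best-effort basis (Linux only). The names `slab_bounds`, `pin_to_cpu`, and `run_fdtd` are hypothetical. Because of Python's global interpreter lock, the sketch shows the structure (fixed slab ownership, pinning, barrier between the H and E updates) rather than any real speedup; a compiled implementation would additionally have each thread initialize its own slab so that the memory pages land on that thread's NUMA node.

```python
import os
import threading


def slab_bounds(n_cells, n_workers):
    """Split range(n_cells) into n_workers contiguous [start, stop) slabs."""
    base, extra = divmod(n_cells, n_workers)
    bounds, start = [], 0
    for w in range(n_workers):
        stop = start + base + (1 if w < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def pin_to_cpu(cpu):
    """Best-effort pinning of the calling thread (Linux only, no-op elsewhere)."""
    if hasattr(os, "sched_setaffinity"):
        try:
            os.sched_setaffinity(0, {cpu})
        except OSError:
            pass  # the requested CPU is not available to this process


def run_fdtd(n_cells, n_steps, n_workers, courant=0.5):
    """1D FDTD in normalized units; every worker keeps the same slab throughout."""
    ez = [0.0] * n_cells
    hy = [0.0] * n_cells  # hy[i] sits between ez[i] and ez[i + 1]; last entry unused
    ez[n_cells // 2] = 1.0  # initial pulse in the middle of the grid
    barrier = threading.Barrier(n_workers)
    cpus = sorted(os.sched_getaffinity(0)) if hasattr(os, "sched_getaffinity") else []

    def worker(w, start, stop):
        if cpus:
            pin_to_cpu(cpus[w % len(cpus)])
        for _ in range(n_steps):
            # H update: reads E (possibly from the neighbouring slab), writes own H.
            for i in range(start, min(stop, n_cells - 1)):
                hy[i] += courant * (ez[i + 1] - ez[i])
            barrier.wait()
            # E update: end cells are left at zero (perfect electric conductor).
            for i in range(max(start, 1), min(stop, n_cells - 1)):
                ez[i] += courant * (hy[i] - hy[i - 1])
            barrier.wait()

    threads = [
        threading.Thread(target=worker, args=(w, start, stop))
        for w, (start, stop) in enumerate(slab_bounds(n_cells, n_workers))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ez, hy
```

Keeping the slab-to-worker mapping fixed for the whole run is the design choice that matters here: each worker touches the same memory region at every time step, and only the single cells at slab boundaries are read from a neighbour.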

Xiaomei Guo
