A High Performance Parallel FDTD Based on Winsock and Multi-Threading on a PC-Cluster

Parallel computing supplies the computing power and memory that the FDTD method needs to simulate electrically large, complex structures. This paper develops a high-performance parallel FDTD for multi-core cluster systems. It uses Winsock for efficient inter-process communication and multi-threading to make full use of the multi-core processors on a PC cluster. Key steps of parallel FDTD, including synchronization, data exchange, and load balancing, are investigated. An experiment simulating the scattering of an incident electromagnetic wave from a computer case shows that the proposed parallel FDTD achieves a speedup of 25.1 and a parallel efficiency of 83.7% on 10 processors with 30 cores. Under the same conditions, it outperforms traditional parallel FDTD based on MPI and on MPI-OpenMP, which reach speedups of 22.9 and 24.9 and parallel efficiencies of 76.3% and 83.1%, respectively.

Index Terms: FDTD, multi-threading, parallel computation, PC cluster, Winsock.
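The reported efficiencies appear consistent with the usual definition of parallel efficiency, speedup divided by the number of workers, taking the 30 cores as the worker count. The short Python sketch below checks that arithmetic against the figures quoted in the abstract. The function name `parallel_efficiency` and the choice of 30 as the divisor are assumptions made for illustration; the abstract does not state the formula it used.

```python
def parallel_efficiency(speedup: float, workers: int) -> float:
    """Return parallel efficiency in percent: 100 * speedup / workers."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return 100.0 * speedup / workers


# Assumption: efficiency is measured against the 30 cores, not the 10 processors.
CORES = 30

# (speedup, efficiency in percent) as quoted in the abstract.
reported = {
    "Winsock + multi-threading": (25.1, 83.7),
    "MPI": (22.9, 76.3),
    "MPI-OpenMP": (24.9, 83.1),
}

for name, (speedup, stated) in reported.items():
    computed = parallel_efficiency(speedup, CORES)
    print(f"{name}: computed {computed:.1f}%, reported {stated}%")
```

The computed values agree with the reported ones to within about 0.1 percentage point. The small gap for MPI-OpenMP may simply reflect rounding of the published speedup.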
