FDTD speedups obtained in distributed computing on a Linux workstation cluster

The project investigated various aspects of a parallel FDTD implementation on a workstation cluster. The computational grid was divided among the nodes. For a fixed-size problem, the speedup saturates as the number of processors increases, because each processor spends less time computing but essentially the same time communicating with its neighbors. To take full advantage of the parallel algorithm, the problem size must therefore be sufficiently large compared with the number of processors; for very large problems, a large number of processors can be employed efficiently to obtain a linear speedup. In this work, the Message Passing Interface (MPI) parallel implementation was integrated with POSIX threads using the pthreads library, because each node in the cluster was equipped with two processors. On each node, each process contained two threads that executed in parallel. As expected, for sufficiently large problems the speedup increased by almost a factor of two when threads were used.
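To make the domain decomposition concrete, the following is a minimal sketch, not the project's actual code, of a one-dimensional decomposition of the FDTD grid along z with a nearest-neighbor halo exchange at each time step. The grid dimensions are arbitrary, and the field-update routines appear only as placeholder comments.

```c
/* Sketch: 1-D slab decomposition of an FDTD grid with per-step halo
 * exchange between neighboring MPI processes. Illustrative only. */
#include <mpi.h>
#include <stdlib.h>

#define NX 100
#define NY 100
#define NZ_GLOBAL 400      /* assumed divisible by the number of processes */
#define NSTEPS 1000

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int nz   = NZ_GLOBAL / size;                      /* local slab thickness  */
    int up   = (rank + 1 < size) ? rank + 1 : MPI_PROC_NULL;
    int down = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;

    /* Local slab (planes 1..nz) plus one ghost xy-plane on each side. */
    size_t plane = (size_t)NX * NY;
    double *ez = calloc(plane * (nz + 2), sizeof *ez);
    double *hy = calloc(plane * (nz + 2), sizeof *hy);

    for (int step = 0; step < NSTEPS; step++) {
        /* update_h(hy, ez, nz);  local H-field update (placeholder) */

        /* Send the top interior H plane up; receive the ghost plane from below. */
        MPI_Sendrecv(hy + plane * nz, (int)plane, MPI_DOUBLE, up,   0,
                     hy,              (int)plane, MPI_DOUBLE, down, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* update_e(ez, hy, nz);  local E-field update (placeholder) */

        /* Send the bottom interior E plane down; receive the ghost plane from above. */
        MPI_Sendrecv(ez + plane,            (int)plane, MPI_DOUBLE, down, 1,
                     ez + plane * (nz + 1), (int)plane, MPI_DOUBLE, up,   1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    free(ez);
    free(hy);
    MPI_Finalize();
    return 0;
}
```

Because only the two boundary planes are exchanged per step, communication time is roughly constant per process while computation time shrinks as the slab gets thinner, which is why the speedup saturates for a fixed problem size.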
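The threaded part of the hybrid scheme can be sketched as follows. This only illustrates splitting one node's slab between two POSIX threads, one per processor; the worker function, the placeholder update routines, and the barrier-based synchronization are assumptions for the sketch, not the project's implementation.

```c
/* Sketch: one node's FDTD slab split between two POSIX threads,
 * synchronized with barriers between the H and E updates. Illustrative only. */
#include <pthread.h>

#define NTHREADS 2
#define NZ_LOCAL 200
#define NSTEPS 1000

static pthread_barrier_t barrier;

struct slice { int k_start, k_end; };   /* z-range assigned to one thread */

static void *worker(void *arg)
{
    struct slice *s = arg;
    for (int step = 0; step < NSTEPS; step++) {
        /* H-field update over this thread's z-range (placeholder):
         *   for (int k = s->k_start; k <= s->k_end; k++) update_h_plane(k); */
        pthread_barrier_wait(&barrier);  /* all H updates finish before E uses them */
        /* E-field update over this thread's z-range (placeholder):
         *   for (int k = s->k_start; k <= s->k_end; k++) update_e_plane(k); */
        pthread_barrier_wait(&barrier);  /* all E updates finish before the next step */
    }
    (void)s;   /* fields are used only by the placeholder updates above */
    return NULL;
}

int main(void)
{
    pthread_t tid[NTHREADS];
    struct slice s[NTHREADS];

    pthread_barrier_init(&barrier, NULL, NTHREADS);

    /* Split the local slab into two contiguous z-ranges, one per processor. */
    int half = NZ_LOCAL / 2;
    s[0] = (struct slice){ 1, half };
    s[1] = (struct slice){ half + 1, NZ_LOCAL };

    for (int t = 0; t < NTHREADS; t++)
        pthread_create(&tid[t], NULL, worker, &s[t]);
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(tid[t], NULL);

    pthread_barrier_destroy(&barrier);
    return 0;
}
```

In a combined MPI/pthreads code of this kind, the MPI halo exchange would typically be performed between the barriers by a single thread per process, so that the two threads only carry the local field updates; this is consistent with the reported near-doubling of speedup on dual-processor nodes for sufficiently large problems.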