Scalable parallel FFT for spectral simulations on a Beowulf cluster

The implementation and performance of the multidimensional Fast Fourier Transform (FFT) on a distributed-memory Beowulf cluster are examined. We focus on the three-dimensional (3D) real transform, an essential computational component of Galerkin and pseudo-spectral codes. The approach studied is a 1D domain decomposition algorithm that relies on a communication-intensive transpose operation involving P processors. Communication is based on the standard, portable Message Passing Interface (MPI). We show that 1/P scaling of execution time at fixed problem size N^3 (i.e., linear speedup) can be obtained provided that (1) the transpose algorithm is optimized for simultaneous block communication by all processors; and (2) the exchanges are arranged as non-overlapping pairwise communication between processors, which eliminates blocking when standard Fast Ethernet interconnects are employed. This method provides the basis for scalable and efficient spectral-method computations of hydrodynamic and magnetohydrodynamic turbulence on Beowulf clusters assembled from standard commodity components. An example is presented using a 3D passive scalar code.
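The two conditions on the transpose can be illustrated with a small simulation. The sketch below is not the authors' code: the abstract does not give the exact pairing rule, so the XOR pairing used here (which requires P to be a power of two) is an assumption, and the function names `pairwise_schedule` and `transpose_blocks` are hypothetical. It builds P-1 communication steps in which every processor talks to exactly one partner, then applies them to a P x P grid of blocks to show that the result is the block transpose. In an MPI code, each step might map onto one paired exchange per rank, for example with `MPI_Sendrecv`.

```python
def pairwise_schedule(P):
    """Return P-1 steps; each step lists every rank's partner for that step.

    Assumption: XOR pairing, which needs P to be a power of two.
    """
    if P < 1 or P & (P - 1):
        raise ValueError("this sketch assumes P is a power of two")
    return [[rank ^ step for rank in range(P)] for step in range(1, P)]


def transpose_blocks(local, schedule):
    """Simulate the distributed transpose on one machine.

    local[r][c] is the block held by rank r that belongs on rank c.
    In the result, out[r][c] is the block rank r received from rank c.
    """
    P = len(local)
    out = [[None] * P for _ in range(P)]
    for r in range(P):
        out[r][r] = local[r][r]  # the diagonal block never leaves its owner
    for partners in schedule:
        # All pairs in one step are disjoint, so they can exchange at once.
        for r, q in enumerate(partners):
            # Rank r sends local[r][q] to q and receives local[q][r] from q.
            out[r][q] = local[q][r]
    return out


P = 4
schedule = pairwise_schedule(P)
# Label each block with (owner rank, destination rank) to trace its movement.
blocks = [[(r, c) for c in range(P)] for r in range(P)]
result = transpose_blocks(blocks, schedule)

for step, partners in enumerate(schedule, start=1):
    pairs = sorted({tuple(sorted((r, q))) for r, q in enumerate(partners)})
    print(f"step {step}: {pairs}")
```

Because every step is a set of disjoint pairs, no processor has to wait on a link that another exchange is using, which is the property the abstract credits for avoiding blocking. Each processor sends P-1 blocks in total, one per step.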
