FFT algorithms and their adaptation to parallel processing

Abstract: The development of the fast Fourier transform (FFT) and its numerous variants over the past 30 years has led to very efficient software and hardware implementations of the transform on uniprocessor computers. In recent years, many researchers have recognized the practical importance of minimizing computing time by parallelizing sequential FFT algorithms in various ways for today's high-performance multiprocessor computers. This paper presents many FFT variants already proposed by others in a common framework, which illuminates the progress made to date in parallelizing them. In addition, three new parallel FFT algorithms are presented, along with their communication complexity results. The proposed algorithms show alternative ways of designing parallel FFT algorithms that reduce communication cost and allow more flexibility in the choice of data mappings.
