A self-sorting in-place fast Fourier transform algorithm suitable for vector and parallel processing

Summary. We propose a new algorithm for fast Fourier transforms. The algorithm features uniformly long vector lengths and stride-one data access, so it is well adapted to modern vector computers such as the Fujitsu VP2200, which have several floating-point pipelines per CPU and very fast stride-one data access. It also has favorable properties for distributed-memory computers, as all communication is gathered into a single step. The algorithm has been implemented on the Fujitsu VP2200 using the basic subroutines for fast Fourier transforms discussed elsewhere. We develop the theory of index-digit permutations to some extent. With this theory we can derive the splitting formulas for almost all mixed-radix FFT algorithms known so far. This framework enables us not only to prove these algorithms correct but also to derive our new algorithm. The development and systematic use of this framework is new, and it simplifies the proofs, which reduce to the application of matrix recursions.
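To make the notion of an index-digit permutation concrete, the following is a minimal sketch and not the paper's own formulation. It assumes that an array index is written in mixed-radix digits, least significant first, and that the permutation reorders those digits together with their radices. The function names (`split_digits`, `join_digits`, `index_digit_permute`) are hypothetical and chosen only for illustration.

```python
def split_digits(index, radices):
    """Return the mixed-radix digits of index, least significant first."""
    digits = []
    for r in radices:
        digits.append(index % r)
        index //= r
    return digits


def join_digits(digits, radices):
    """Inverse of split_digits: rebuild the index from its digits."""
    index = 0
    # Horner evaluation, starting from the most significant digit.
    for d, r in zip(reversed(digits), reversed(radices)):
        index = index * r + d
    return index


def index_digit_permute(data, radices, perm):
    """Move each element to the index obtained by reordering its digits.

    Digit j of the new index is digit perm[j] of the old index, and the
    radix of new digit j is radices[perm[j]].
    """
    n = 1
    for r in radices:
        n *= r
    if len(data) != n:
        raise ValueError("len(data) must equal the product of the radices")
    new_radices = [radices[p] for p in perm]
    out = [None] * n
    for i in range(n):
        digits = split_digits(i, radices)
        new_digits = [digits[p] for p in perm]
        out[join_digits(new_digits, new_radices)] = data[i]
    return out


# Swapping the two digits of a length-6 array with radices (2, 3)
# gathers the even-indexed elements first, then the odd-indexed ones.
print(index_digit_permute(list(range(6)), [2, 3], [1, 0]))
# prints [0, 2, 4, 1, 3, 5]

# Reversing three radix-2 digits gives the familiar bit-reversal order.
print(index_digit_permute(list(range(8)), [2, 2, 2], [2, 1, 0]))
# prints [0, 4, 2, 6, 1, 5, 3, 7]
```

Under these assumptions, the two-digit swap is the even/odd sort that appears in radix-2 splitting, and full digit reversal is the reordering that a self-sorting algorithm avoids performing as a separate pass.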
