A High-Performance FFT Algorithm for Vector Supercomputers
Many traditional algorithms for computing the fast Fourier transform (FFT) on conventional computers are unacceptable for advanced vector and parallel computers because they involve nonunit, power-of-two memory strides. This paper presents a practical technique for computing the FFT that avoids all such strides and appears to be near-optimal for a variety of current vector and parallel computers. Performance results of a program based on this technique are presented. Notable among these results is that a Fortran implementation of this algorithm on the CRAY-2 runs up to 77% faster than Cray's assembly-coded library routine.
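To make the stride issue concrete, the following is a minimal Python sketch of a textbook radix-2 Stockham autosort FFT, a well-known formulation whose inner loop reads and writes contiguous runs of memory and needs no bit-reversal pass. It is offered only as an illustration and is not claimed to be the algorithm of this paper; the function names `stockham_fft` and `naive_dft` are hypothetical, and the power-of-two length restriction is an assumption of the sketch. In the early passes the contiguous runs are short, so the sketch by itself does not show how long unit-stride vectors are obtained.

```python
import cmath


def stockham_fft(x):
    """Radix-2 Stockham autosort FFT (illustrative sketch, not the paper's code).

    Assumes len(x) is a power of two. Output is in natural order.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = [complex(v) for v in x]   # current input buffer
    b = [0j] * n                  # current output buffer
    length = n                    # size of each sub-transform in this pass
    s = 1                         # length of each contiguous run
    while length > 1:
        m = length // 2
        for p in range(m):
            w = cmath.exp(-2j * cmath.pi * p / length)  # twiddle factor
            # Inner loop: every index advances by 1 in q (unit stride).
            for q in range(s):
                u = a[q + s * p]
                v = a[q + s * (p + m)]
                b[q + s * (2 * p)] = u + v
                b[q + s * (2 * p + 1)] = (u - v) * w
        a, b = b, a               # ping-pong the two buffers
        length = m
        s *= 2
    return a


def naive_dft(x):
    """O(n^2) reference DFT, used only to check the sketch."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    data = [complex(k, 1) for k in range(8)]
    err = max(abs(g - r) for g, r in zip(stockham_fft(data), naive_dft(data)))
    print("max abs error vs. naive DFT:", err)
```

By contrast, a conventional in-place radix-2 FFT pairs elements that lie 1, 2, 4, ..., n/2 positions apart, which is the kind of power-of-two stride the abstract refers to. The sketch trades that access pattern for a second buffer of the same size.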