A Blocking Algorithm for FFT on Cache-Based Processors
In this paper, we propose a blocking algorithm for computing large one-dimensional fast Fourier transforms (FFTs) on cache-based processors. The proposed algorithm is based on the six-step FFT algorithm. We show that the block six-step FFT algorithm improves performance by making effective use of cache memory. Performance results for one-dimensional FFTs on a Sun Ultra 10 and a Pentium III PC are reported. For a 2^20-point FFT, we obtained about 108 MFLOPS on the Sun Ultra 10 (UltraSPARC-IIi, 333 MHz) and about 247 MFLOPS on the 1 GHz Pentium III PC.
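For orientation, the plain six-step FFT that the abstract builds on can be sketched as follows. This is a minimal illustration only, not the paper's blocked variant: it assumes N = n1 * n2 with both factors powers of two, and the function names (`fft`, `transpose`, `six_step_fft`) are hypothetical helpers introduced here. The three explicit transposes are the memory-bound steps that a blocked formulation would aim to make cache-friendly.

```python
import cmath

def fft(a):
    """Recursive radix-2 FFT (assumption: len(a) is a power of two)."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2])
    odd = fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def transpose(m):
    """Return the transpose of a list-of-lists matrix."""
    return [list(row) for row in zip(*m)]

def six_step_fft(x, n1, n2):
    """Length n1*n2 DFT of x via the six-step decomposition (unblocked sketch)."""
    n = n1 * n2
    assert len(x) == n
    # View x as an n2 x n1 matrix: a[j2][j1] = x[j1 + j2*n1].
    a = [list(x[j2 * n1:(j2 + 1) * n1]) for j2 in range(n2)]
    # Step 1: transpose to n1 x n2.
    b = transpose(a)
    # Step 2: n1 independent FFTs of length n2 (one per row).
    b = [fft(row) for row in b]
    # Step 3: multiply by twiddle factors exp(-2*pi*i*j1*k2/n).
    b = [[b[j1][k2] * cmath.exp(-2j * cmath.pi * j1 * k2 / n)
          for k2 in range(n2)] for j1 in range(n1)]
    # Step 4: transpose to n2 x n1.
    c = transpose(b)
    # Step 5: n2 independent FFTs of length n1 (one per row).
    c = [fft(row) for row in c]
    # Step 6: transpose to n1 x n2, so that output index k = k2 + k1*n2.
    d = transpose(c)
    return [v for row in d for v in row]
```

Calling `six_step_fft(x, 8, 8)` on a 64-element list should agree with a direct DFT of `x` up to rounding error. Choosing n1 and n2 near the square root of N keeps each row FFT short enough to plausibly fit in cache, which is the usual motivation for this decomposition.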