Implementation and Evaluation of Parallel FFT Using SIMD Instructions on Multi-core Processors

In this paper, we propose an implementation of a parallel two-dimensional fast Fourier transform (FFT) using short-vector SIMD instructions on multi-core processors. We show that combining vectorization with the block two-dimensional FFT algorithm effectively improves performance. We vectorized the FFT kernels using Intel's Streaming SIMD Extensions 3 (SSE3) instructions. Performance results for two-dimensional FFTs on multi-core processors are reported. We obtained over 2.7 GFLOPS on a dual-core Intel Xeon (2.8 GHz, two CPUs, four cores) and over 3.3 GFLOPS on an Intel Core2 Duo E6600 (2.4 GHz, one CPU, two cores) for a 2^10 × 2^10-point FFT.
