Scalable Parallelization Strategies to Accelerate NuFFT Data Translation on Multicores

The non-uniform FFT (NuFFT) is widely used in many applications. In this paper, we propose two new scalable parallelization strategies to accelerate the data translation step of the NuFFT on multicore machines. Both schemes employ geometric tiling and binning to exploit data locality, and both use recursive partitioning and scheduling with dynamic task allocation to achieve load balancing. Experimental results collected on a commercial multicore machine show that, with our parallelization strategies, the data translation step is no longer the bottleneck in the NuFFT computation, even for large data sets and regardless of the input sample distribution.
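To make the terms in the abstract concrete, the following is a minimal sketch of tiling and binning, recursive partitioning, and dynamic task allocation for 2-D samples. It is an illustration under stated assumptions, not the schemes proposed in the paper: the function names (`bin_samples`, `split_tasks`, `run_dynamic`), the quadtree-style splitting rule, the `max_samples` threshold, and the largest-first ordering are all hypothetical choices made for this example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def bin_samples(points, grid_size, tile):
    """Group sample indices by the square tile that contains each sample."""
    last = (grid_size - 1) // tile  # index of the last tile along each axis
    bins = defaultdict(list)
    for i, (x, y) in enumerate(points):
        # Clamp so a coordinate on the upper boundary lands in the last tile.
        bins[(min(int(x) // tile, last), min(int(y) // tile, last))].append(i)
    return bins


def split_tasks(points, idx, x0, y0, size, max_samples, min_size=1):
    """Recursively quarter a square region until each non-empty piece holds
    at most max_samples samples (or the region reaches min_size)."""
    if len(idx) <= max_samples or size <= min_size:
        return [(x0, y0, size, idx)] if idx else []
    half = size / 2
    quads = defaultdict(list)
    for i in idx:
        x, y = points[i]
        quads[(x >= x0 + half, y >= y0 + half)].append(i)
    tasks = []
    for (right, top), sub in quads.items():
        # Booleans act as 0/1 offsets that select the quadrant origin.
        tasks += split_tasks(points, sub, x0 + half * right, y0 + half * top,
                             half, max_samples, min_size)
    return tasks


def run_dynamic(tasks, work, workers=4):
    """Hand tasks to a pool; idle workers pull the next task from a shared
    queue. Largest tasks are submitted first (an assumed heuristic)."""
    ordered = sorted(tasks, key=lambda t: len(t[3]), reverse=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(work, ordered))
```

With a clustered input, dense regions are split into more and smaller tasks while sparse regions stay coarse, which is what lets a pool of workers stay evenly loaded. In CPython, threads do not speed up pure-Python arithmetic, so the pool here only illustrates the scheduling pattern.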
