gearshifft - The FFT Benchmark Suite for Heterogeneous Platforms

Fast Fourier Transforms (FFTs) are used in a wide variety of fields ranging from computer science to the natural sciences and engineering. With the rising data production bandwidths of modern FFT applications, judging which algorithmic tool to apply can be vital to any scientific endeavor. As tailored FFT implementations exist for an ever-increasing variety of high-performance computer hardware, choosing the best-performing FFT implementation has strong implications for future hardware purchase decisions, for the resources FFTs consume, and for possibly decisive financial and time savings ahead of the competition. This paper therefore presents gearshifft, an open-source and vendor-agnostic benchmark suite that processes a wide variety of problem sizes and types with state-of-the-art FFT implementations (fftw, clfft and cufft). gearshifft provides a reproducible, unbiased and fair comparison on a wide variety of hardware to explore which FFT variant is best for a given problem size.
