A scheduling policy to save 10% of communication time in parallel fast Fourier transform

The authors thank the members of the Supercomputer Laboratory at King Abdullah University of Science and Technology (KAUST) for providing the necessary resources and guidance. This research was supported by the Extreme Computing Research Center (ECRC) at KAUST and by the Simulation and Modelling Laboratory at IIT Kanpur.
