Implementation of Parallel 1-D FFT on GPU Clusters
[1] J. Tukey, et al. An algorithm for the machine calculation of complex Fourier series, 1965.
[2] David H. Bailey, et al. FFTs in external or hierarchical memory, 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).
[3] C. Loan. Computational Frameworks for the Fast Fourier Transform, 1992.
[4] M. Hegland. A self-sorting in-place fast Fourier transform algorithm suitable for vector and parallel processing, 1994.
[5] Ramesh C. Agarwal, et al. A high performance parallel algorithm for 1-D FFT, 1994, Proceedings of Supercomputing '94.
[6] Franz Franchetti, et al. SPIRAL: Code Generation for DSP Transforms, 2005, Proceedings of the IEEE.
[7] Steven G. Johnson, et al. The Design and Implementation of FFTW3, 2005, Proceedings of the IEEE.
[8] Franz Franchetti, et al. Automatic Performance Optimization of the Discrete Fourier Transform on Distributed Memory Computers, 2006, ISPA.
[9] Satoshi Matsuoka, et al. Auto-tuning 3-D FFT library for CUDA GPUs, 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[10] Yasushi Negishi, et al. Overlapping Methods of All-to-All Communication and FFT Algorithms for Torus-Connected Massively Parallel Supercomputers, 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[11] Yifeng Chen, et al. Large-scale FFT on GPU clusters, 2010, ICS '10.
[12] Sayantan Sur, et al. MVAPICH2-GPU: optimized GPU to GPU communication for InfiniBand clusters, 2011, Computer Science - Research and Development.
[13] Ping Tak Peter Tang, et al. A framework for low-communication 1-D FFT, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[14] Daisuke Takahashi, et al. An Implementation of Parallel 1-D FFT on the K Computer, 2012, 2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems.
[15] Satoshi Matsuoka, et al. Scalable multi-GPU 3-D FFT for TSUBAME 2.0 Supercomputer, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.