Dynamic data layouts for cache-conscious factorization of DFT
[1] David H. Bailey. Unfavorable Strides in Cache Memory Systems (RNR Technical Report RNR-92-015), 1995, Sci. Program.
[2] David H. Bailey. Unfavorable strides in cache memory systems, 1992.
[3] Michael E. Wolf, et al. The cache performance and optimizations of blocked algorithms, 1991, ASPLOS IV.
[4] Monica S. Lam, et al. Maximizing Multiprocessor Performance with the SUIF Compiler, 1996, Digit. Tech. J.
[5] Kevin R. Wadleigh, et al. High Performance FFT Algorithms for Cache-Coherent Multiprocessors, 1999, Int. J. High Perform. Comput. Appl.
[6] Steven G. Johnson, et al. FFTW: an adaptive software architecture for the FFT, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[7] V. K. Prasanna, et al. Utilizing the power of high-performance computing, 1998.
[8] Viktor K. Prasanna, et al. Fast parallel implementation of DFT using configurable devices, 1997, FPL.
[9] J. Tukey, et al. An algorithm for the machine calculation of complex Fourier series, 1965.
[10] David H. Bailey, et al. FFTs in external or hierarchical memory, 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).
[11] Margaret Martonosi, et al. Characterizing the Memory Behavior of Compiler-Parallelized Applications, 1996, IEEE Trans. Parallel Distributed Syst.
[12] Viktor K. Prasanna, et al. Parallel implementation of synthetic aperture radar on high performance computing platforms, 1997, Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing.
[13] Mithuna Thottethodi, et al. Nonlinear array layouts for hierarchical memory systems, 1999, ICS '99.