Four Easy Ways to a Faster FFT
[1] Steve Karmesin,et al. Array Design and Expression Evaluation in POOMA II , 1998, ISCOPE.
[2] Sven-Bodo Scholz,et al. On Programming Scientific Applications in SAC - A Functional Language Extended by a Subsystem for High-Level Array Operations , 1996, Implementation of Functional Languages.
[3] Bradford L. Chamberlain,et al. The case for high-level parallel programming in ZPL , 1998 .
[4] Harry B. Hunt,et al. On Materializations of Array-Valued Temporaries , 2000, LCPC.
[5] C. Loan. Computational Frameworks for the Fast Fourier Transform , 1992 .
[6] Todd L. Veldhuizen,et al. Arrays in Blitz++ , 1998, ISCOPE.
[7] N. Ahmed,et al. FAST TRANSFORMS, algorithms, analysis, applications , 1983, Proceedings of the IEEE.
[8] Lawrence Snyder,et al. ZPL: An Array Sublanguage , 1993, LCPC.
[9] Todd L. Veldhuizen,et al. Expression templates , 1996 .
[10] James Demmel,et al. Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology , 1997, ICS '97.
[11] R. Tolimieri,et al. Algorithms for Discrete Fourier Transform and Convolution , 1989 .
[12] Sadayappan,et al. EXTENT : A Portable Programming and Implementing High-Performance , 1997 .
[13] Chao Lu,et al. Mathematics of Multidimensional Fourier Transform Algorithms , 1993 .
[14] Ramesh C. Agarwal,et al. A high performance parallel algorithm for 1-D FFT , 1994, Proceedings of Supercomputing '94.
[15] Jack J. Dongarra,et al. Automatically Tuned Linear Algebra Software , 1998, Proceedings of the IEEE/ACM SC98 Conference.
[16] Sandeep K. S. Gupta,et al. On the Synthesis of Parallel Programs from Tensor Product Formulas for Block Recursive Algorithms , 1992, LCPC.
[17] Sandeep K. S. Gupta,et al. Implementing Fast Fourier Transforms on Distributed-Memory Multiprocessors Using Data Redistributions , 1994, Parallel Process. Lett..
[18] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1990 .
[19] T. Forshaw. Everything you always wanted to know , 1977 .
[20] Anthony Skjellum,et al. Driving Issues in Scalable Libraries: Poly-Algorithms, Data Distribution Independence, Redistribution, Local Storage Schemes , 1995, PPSC.
[21] Vipin Kumar,et al. The Scalability of FFT on Parallel Computers , 1993, IEEE Trans. Parallel Distributed Syst..
[22] Andrew Lumsdaine,et al. Parallel Extensions to the Matrix Template Library , 1997, PPSC.
[23] Bradford L. Chamberlain,et al. Factor-Join: A Unique Approach to Compiling Array Languages for Parallel Machines , 1996, LCPC.
[24] Lenore M. Restifo Mullin,et al. Formal method for scheduling, routing and communication protocol , 1993, [1993] Proceedings The 2nd International Symposium on High Performance Distributed Computing.
[25] D. Miles. Compute intensity and the FFT , 1993, Supercomputing '93.
[26] Sandeep K. S. Gupta,et al. A Framework for Generating Distributed-Memory Parallel Programs for Block Recursive Algorithms , 1986, J. Parallel Distributed Comput..
[27] Todd L. Veldhuizen,et al. Using C++ template metaprograms , 1996 .
[28] Michael Conner,et al. Recursive fast algorithm and the role of the tensor product , 1992, IEEE Trans. Signal Process..
[29] Ramesh Subramonian,et al. LogP: towards a realistic model of parallel computation , 1993, PPOPP '93.
[30] Anthony Skjellum,et al. A poly‐algorithm for parallel dense matrix multiplication on two‐dimensional process grid topologies , 1997 .
[31] Steve Karmesin,et al. Optimization of Data-Parallel Field Expressions in the POOMA Framework , 1997, ISCOPE.
[32] Jeremy G. Siek,et al. The Matrix Template Library: A Generic Programming Approach to High Performance Numerical Linear Algebra , 1998, ISCOPE.
[33] Bradford L. Chamberlain,et al. A Compiler Abstraction for Machine Independent Parallel Communication Generation , 1997, LCPC.
[34] L. Mullin. A mathematics of arrays , 1988 .