Architecture-Cognizant Divide and Conquer Algorithms
[1] Bowen Alpern, et al. Space-limited procedures: a methodology for portable high-performance, 1995, Programming Models for Massively Parallel Computers.
[2] Steven S. Muchnick, et al. Advanced Compiler Design and Implementation, 1997.
[3] Matteo Frigo, et al. An analysis of dag-consistent distributed shared-memory algorithms, 1996, SPAA '96.
[4] W. Jalby, et al. To copy or not to copy: a compile-time technique for assessing when data copying should be used to eliminate cache conflicts, 1993, Supercomputing '93.
[5] Jack Dongarra, et al. LAPACK: a portable linear algebra library for high-performance computers, 1990, SC.
[6] Mithuna Thottethodi, et al. Tuning Strassen's Matrix Multiplication for Memory Efficiency, 1998, Proceedings of the IEEE/ACM SC98 Conference.
[7] Steven G. Johnson, et al. The Fastest Fourier Transform in the West, 1997.
[8] Bowen Alpern, et al. Rectilinear Steiner Tree Minimization on a Workstation, 1992, Computational Support for Discrete Mathematics.
[9] Matteo Frigo, et al. Portable high-performance programs, 1999.
[10] Murray Cole, et al. Algorithmic skeletons: a structured approach to the management of parallel computation, 1988.
[11] Sivan Toledo. Locality of Reference in LU Decomposition with Partial Pivoting, 1997, SIAM J. Matrix Anal. Appl.
[12] Bernd Freisleben, et al. Automatic Parallelization of Divide-and-Conquer Algorithms, 1992, CONPAR.
[13] Martin C. Rinard, et al. Automatic parallelization of divide and conquer algorithms, 1999, PPoPP '99.
[14] Richard E. Ladner, et al. The influence of caches on the performance of sorting, 1997, SODA '97.