Architecture-Cognizant Divide and Conquer Algorithms
[1] Martin C. Rinard, et al. Automatic parallelization of divide and conquer algorithms, 1999, PPoPP '99.
[2] Matteo Frigo, et al. Portable high-performance programs, 1999.
[3] Mithuna Thottethodi, et al. Tuning Strassen's Matrix Multiplication for Memory Efficiency, 1998, Proceedings of the IEEE/ACM SC98 Conference.
[4] Sivan Toledo. Locality of Reference in LU Decomposition with Partial Pivoting, 1997, SIAM J. Matrix Anal. Appl.
[5] Steven G. Johnson, et al. The Fastest Fourier Transform in the West, 1997.
[6] Richard E. Ladner, et al. The influence of caches on the performance of sorting, 1997, SODA '97.
[7] Steven S. Muchnick, et al. Advanced Compiler Design and Implementation, 1997.
[8] Matteo Frigo, et al. An analysis of dag-consistent distributed shared-memory algorithms, 1996, SPAA '96.
[9] Bowen Alpern, et al. Space-limited procedures: a methodology for portable high-performance, 1995, Programming Models for Massively Parallel Computers.
[10] Olivier Temam, et al. To copy or not to copy: A compile-time technique for assessing when data copying should be used to eliminate cache conflicts, 1993, Supercomputing '93. Proceedings.
[11] Bowen Alpern, et al. Rectilinear Steiner Tree Minimization on a Workstation, 1992, Computational Support for Discrete Mathematics.
[12] James Demmel, et al. LAPACK: A portable linear algebra library for high-performance computers, 1990, Proceedings SUPERCOMPUTING '90.
[13] Murray Cole, et al. Algorithmic skeletons: a structured approach to the management of parallel computation, 1988.