A new approach for automatic parallelization of blocked linear algebra computations
[1] H. T. Kung, et al. The Warp Computer: Architecture, Implementation, and Performance, 1987, IEEE Transactions on Computers.
[2] Jack J. Dongarra, et al. A set of level 3 basic linear algebra subprograms, 1990, TOMS.
[3] H. T. Kung, et al. Supporting systolic and memory communication in iWarp, 1990, ISCA '90.
[4] Hudson Benedito Ribas. Automatic generation of systolic programs from nested loops, 1990.
[5] R. Adams. Proceedings, 1947.
[6] Ken Kennedy, et al. Computer support for machine-independent parallel programming in Fortran D, 1992.
[7] Jack J. Dongarra, et al. An extended set of FORTRAN basic linear algebra subprograms, 1988, TOMS.
[8] P.-S. Tseng, et al. A parallelizing compiler for distributed memory parallel computers, 1989, PLDI 1989.
[9] H. T. Kung, et al. Matrix Triangularization By Systolic Arrays, 1982, Optics & Photonics.
[10] Jack Dongarra, et al. LAPACK Working Note 24: LAPACK Block Factorization Algorithms on the Intel iPSC/860, 1990.
[11] Charles L. Lawson, et al. Basic Linear Algebra Subprograms for Fortran Usage, 1979, TOMS.
[12] William Jalby, et al. Impact of Hierarchical Memory Systems On Linear Algebra Algorithm Design, 1988.
[13] H. T. Kung, et al. Systolic Arrays (for VLSI), 1978.
[14] H. T. Kung. Why systolic architectures?, 1982, Computer.
[15] Shekhar Y. Borkar, et al. iWarp: an integrated solution to high-speed parallel computing, 1988, Proceedings. SUPERCOMPUTING '88.
[16] Michel J. Daydé, et al. Use of parallel level 3 BLAS in LU factorization on three vector multiprocessors: the ALLIANT FX/80, the CRAY-2, and the IBM 3090 VF, 1990, ICS '90.
[17] Michel J. Daydé, et al. Level 3 BLAS in LU Factorization on the Cray-2, ETA-10P, and IBM 3090-200/VF, 1989, Int. J. High Perform. Comput. Appl.