A new approach for automatic parallelization of blocked linear algebra computations


[1] H. T. Kung, et al. The Warp Computer: Architecture, Implementation, and Performance, 1987, IEEE Transactions on Computers.

[2] Jack J. Dongarra, et al. A set of level 3 basic linear algebra subprograms, 1990, TOMS.

[3] H. T. Kung, et al. Supporting systolic and memory communication in iWarp, 1990, ISCA '90.

[4] Hudson Benedito Ribas. Automatic generation of systolic programs from nested loops, 1990.

[5] R. Adams. Proceedings, 1947.

[6] Ken Kennedy, et al. Compiler support for machine-independent parallel programming in Fortran D, 1992.

[7] Jack J. Dongarra, et al. An extended set of FORTRAN basic linear algebra subprograms, 1988, TOMS.

[8] P.-S. Tseng, et al. A parallelizing compiler for distributed memory parallel computers, 1989, PLDI 1989.

[9] H. T. Kung, et al. Matrix Triangularization By Systolic Arrays, 1982, Optics & Photonics.

[10] Jack Dongarra, et al. LAPACK Working Note 24: LAPACK Block Factorization Algorithms on the Intel iPSC/860, 1990.

[11] Charles L. Lawson, et al. Basic Linear Algebra Subprograms for Fortran Usage, 1979, TOMS.

[12] William Jalby, et al. Impact of Hierarchical Memory Systems On Linear Algebra Algorithm Design, 1988.

[13] H. T. Kung, et al. Systolic Arrays (for VLSI), 1978.

[14] H. T. Kung. Why systolic architectures?, 1982, Computer.

[15] Shekhar Y. Borkar, et al. iWarp: an integrated solution to high-speed parallel computing, 1988, Proceedings. SUPERCOMPUTING '88.

[16] Michel J. Daydé, et al. Use of parallel level 3 BLAS in LU factorization on three vector multiprocessors: the ALLIANT FX/80, the CRAY-2, and the IBM 3090 VF, 1990, ICS '90.

[17] Michel J. Daydé, et al. Level 3 BLAS in LU Factorization on the Cray-2, ETA-10P, and IBM 3090-200/VF, 1989, Int. J. High Perform. Comput. Appl.