Multithreading in the PLASMA Library

[1]  Jack J. Dongarra,et al.  Two-Stage Tridiagonal Reduction for Dense Symmetric Matrices Using Tile Algorithms on Multicore Architectures , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.

[2]  Cédric Augonnet,et al.  StarPU: a unified platform for task scheduling on heterogeneous multicore architectures , 2011, Concurr. Comput. Pract. Exp..

[3]  Emmanuel Agullo,et al.  Towards an Efficient Tile Matrix Inversion of Symmetric Positive Definite Matrices on Multicore Architectures , 2010, VECPAR.

[4]  R. C. Whaley,et al.  Scaling LAPACK panel operations using parallel cache assignment , 2010, PPoPP '10.

[5]  Jack J. Dongarra,et al.  Scheduling dense linear algebra operations on multicore processors , 2010, Concurr. Comput. Pract. Exp..

[6]  Emmanuel Agullo,et al.  Comparative study of one-sided factorizations with multiple software packages on multi-core hardware , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.

[7]  Jack Dongarra,et al.  Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects , 2009 .

[8]  Nicholas J. Higham,et al.  INVERSE PROBLEMS NEWSLETTER , 1991 .

[9]  Fred G. Gustavson,et al.  Recursion leads to automatic variable blocking for dense linear-algebra algorithms , 1997, IBM J. Res. Dev..

[10]  Bradley C. Kuszmaul,et al.  Cilk: an efficient multithreaded runtime system , 1995, PPOPP '95.

[11]  John L. Gustafson,et al.  Reevaluating Amdahl's law , 1988, CACM.

[12]  Jack Dongarra,et al.  QUARK Users' Guide: QUeueing And Runtime for Kernels , 2011 .

[13]  James Demmel,et al.  LAPACK Users' Guide, Third Edition , 1999, Software, Environments and Tools.

[14]  Jack Dongarra,et al.  ScaLAPACK Users' Guide , 1987 .

[15]  G. Amdhal,et al.  Validity of the single processor approach to achieving large scale computing capabilities , 1967, AFIPS '67 (Spring).