SUMMA: Scalable Universal Matrix Multiplication Algorithm
In this paper, we give a straightforward, highly efficient, scalable implementation of common matrix multiplication operations. The algorithms are much simpler than previously published methods, yield better performance, and require less workspace. MPI implementations are given, as are performance results on the Intel Paragon system.
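The abstract does not spell out the algorithm itself. As a rough illustration only, the sketch below assumes the usual formulation of this style of method: C = AB is accumulated as a sequence of rank-one updates, where at step l each process row receives its piece of column l of A and each process column receives its piece of row l of B. The code is a serial Python simulation of a logical process grid, not the MPI implementation the paper reports; the function name `summa_simulate` and its parameters are hypothetical, and the broadcasts are modeled by plain list slicing.

```python
def summa_simulate(A, B, grid_rows, grid_cols):
    """Serially simulate a SUMMA-style product C = A * B on a logical
    grid_rows x grid_cols process grid (illustrative sketch only)."""
    m, k = len(A), len(A[0])
    n = len(B[0])
    if len(B) != k:
        raise ValueError("inner dimensions of A and B do not match")
    if m % grid_rows or n % grid_cols:
        raise ValueError("this sketch assumes the grid divides C evenly")

    bm, bn = m // grid_rows, n // grid_cols  # local block size of C

    # Each simulated process (i, j) owns one bm x bn block of C.
    c_local = {
        (i, j): [[0.0] * bn for _ in range(bm)]
        for i in range(grid_rows)
        for j in range(grid_cols)
    }

    for l in range(k):
        for i in range(grid_rows):
            # Piece of column l of A needed by process row i
            # (in an MPI code this would arrive via a row broadcast).
            a_piece = [A[i * bm + r][l] for r in range(bm)]
            for j in range(grid_cols):
                # Piece of row l of B needed by process column j
                # (in an MPI code this would arrive via a column broadcast).
                b_piece = [B[l][j * bn + c] for c in range(bn)]
                block = c_local[(i, j)]
                # Local rank-one update: block += a_piece * b_piece^T.
                for r in range(bm):
                    for c in range(bn):
                        block[r][c] += a_piece[r] * b_piece[c]

    # Gather the local blocks into one matrix so the result can be inspected.
    C = [[0.0] * n for _ in range(m)]
    for (i, j), block in c_local.items():
        for r in range(bm):
            for c in range(bn):
                C[i * bm + r][j * bn + c] = block[r][c]
    return C
```

In this sketch each simulated process holds only its own block of C plus one column piece and one row piece per step, which hints at why such a scheme can need little extra workspace. A practical version would typically broadcast panels of several columns and rows at a time rather than single ones.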