A New Parallel Matrix Multiplication Algorithm for Wormhole-Routed All-Port 2D/3D Torus Networks

A new matrix multiplication algorithm is proposed for massively parallel supercomputers with 2D/3D, all-port torus interconnection networks. The proposed algorithm is based on the traditional row-by-column matrix product model and employs a special routing pattern for better scalability. It compares favorably to the variants of Cannon’s algorithm and the DNS algorithm: because its data communication overhead is lower, it allows matrices of the same size to be multiplied on a larger number of processors.
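For reference, the row-by-column product model mentioned above can be sketched sequentially as follows. This is only an illustration of the underlying arithmetic, where each output entry is the dot product of one row of the first matrix and one column of the second. It does not reproduce the proposed routing pattern or any parallel distribution, and the function name `matmul_row_by_column` is a hypothetical one chosen for this sketch.

```python
def matmul_row_by_column(a, b):
    """Multiply two matrices, given as lists of rows, by the row-by-column rule.

    Illustrative sequential sketch only; not the parallel algorithm itself.
    """
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")
    # c[i][j] is the dot product of row i of a and column j of b.
    return [
        [sum(a[i][t] * b[t][j] for t in range(k)) for j in range(m)]
        for i in range(n)
    ]
```

In a parallel setting, each processor would typically compute a subset of these dot products, so the rows and columns it needs must be routed to it; that data movement is the source of the communication overhead discussed above.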