Mapping full‐systolic arrays for matrix product on XILINX's XC4000(E,EX) FPGAs

Matrix product is a compute-bound problem that can be handled efficiently by elementary systolic algorithms. From a theoretical point of view, most of these algorithms are very simple, sometimes even trivial. However, designing efficient implementations on a fixed-connection network, such as an FPGA where resources are very limited, is more demanding and sometimes quite tedious. The objective of this paper is twofold. First, we describe a full-systolic algorithm for matrix product that, unlike its existing counterparts, requires no preloading of input data into the elementary processors (EPs) and generates output data only from boundary EPs. The resulting architecture accepts an uninterrupted stream of input data and produces an uninterrupted stream of output data, with a latency of 2N-1 for an N×N matrix product. This architecture is also scalable and complies with the constraint of problem-size independence (ψ). Second, we present a methodology for generating a family of very compact matrix product (MP) arrays on FPGAs, based essentially on manual mapping at the configurable logic block (CLB) level coupled with structural-level VHDL.
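To make the idea of streaming operands through a grid of EPs concrete, the following is a minimal software sketch of a textbook two-dimensional systolic array for matrix product. It is not the architecture described in this paper: in this sketch the results accumulate inside each EP rather than leaving through boundary EPs, and its cycle count is 3N-2 rather than the 2N-1 latency quoted above. What it does share with the paper's design is that no operand is preloaded; rows of A enter from the left boundary and columns of B from the top boundary, each skewed by one cycle per row or column. The function name `systolic_matmul` and the register arrays are hypothetical names chosen for illustration.

```python
def systolic_matmul(A, B):
    """Simulate a textbook N x N systolic array computing C = A * B.

    Assumption: this is a generic output-stationary array, used only to
    illustrate skewed operand streaming; it is not the paper's design.
    """
    n = len(A)
    a_reg = [[0] * n for _ in range(n)]  # A value latched in each EP
    b_reg = [[0] * n for _ in range(n)]  # B value latched in each EP
    c = [[0] * n for _ in range(n)]      # accumulator held in each EP
    cycles = 3 * n - 2                   # last product occurs at cycle 3n-3

    for t in range(cycles):
        new_a = [[0] * n for _ in range(n)]
        new_b = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                # Left-boundary EPs read row i of A, delayed by i cycles;
                # interior EPs read what the left neighbour latched last cycle.
                if j == 0:
                    k = t - i
                    a_in = A[i][k] if 0 <= k < n else 0
                else:
                    a_in = a_reg[i][j - 1]
                # Top-boundary EPs read column j of B, delayed by j cycles;
                # interior EPs read what the upper neighbour latched last cycle.
                if i == 0:
                    k = t - j
                    b_in = B[k][j] if 0 <= k < n else 0
                else:
                    b_in = b_reg[i - 1][j]
                c[i][j] += a_in * b_in   # multiply-accumulate in place
                new_a[i][j] = a_in       # pass A to the right next cycle
                new_b[i][j] = b_in       # pass B downward next cycle
        a_reg, b_reg = new_a, new_b
    return c, cycles
```

With this skewing, EP(i, j) sees the operand pair A[i][k] and B[k][j] at cycle t = i + j + k, so every EP only ever communicates with its nearest neighbours. A design that also streams the results out through boundary EPs, as the paper proposes, would need an additional data path for C that this sketch omits.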
