A high throughput FPGA implementation of a bit-level matrix product

This paper presents a novel architecture for a matrix product algorithm. It describes the mathematical model for the algorithm (based on the Baugh-Wooley algorithm) and its design and implementation on a Xilinx FPGA board, and discusses the efficiency of the implementation. For the matrix-matrix product, the architecture requires O(N^2) area and O(2nN) time; for the matrix-vector product, it requires O(N) area and O(2nN) time (where N is the matrix size and n is the word length).
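The abstract does not spell out the mathematical model, so the following is only a software sketch of the general idea, not the paper's architecture. It assumes the commonly used form of the Baugh-Wooley scheme, in which the partial products involving exactly one sign bit are complemented and two constant correction bits are added, so that an n-bit two's complement product can be formed from single-bit AND terms alone. The function names `baugh_wooley_multiply` and `bit_level_matvec` are hypothetical, and the sequential loops stand in for what hardware would do in parallel.

```python
def baugh_wooley_multiply(a, b, n):
    """Multiply two n-bit two's complement integers (n >= 2) from 1-bit partial products.

    Assumed formulation: partial products that involve exactly one sign bit
    are complemented, and constant ones are added at bit positions n and 2n-1.
    """
    a_bits = [(a >> i) & 1 for i in range(n)]  # two's complement bits of a
    b_bits = [(b >> j) & 1 for j in range(n)]  # two's complement bits of b

    # Correction constants of the assumed Baugh-Wooley form.
    acc = (1 << n) + (1 << (2 * n - 1))

    for i in range(n):
        for j in range(n):
            pp = a_bits[i] & b_bits[j]         # 1-bit partial product
            if (i == n - 1) != (j == n - 1):   # exactly one operand bit is a sign bit
                pp ^= 1                        # complement that partial product
            acc += pp << (i + j)

    acc &= (1 << (2 * n)) - 1                  # keep the 2n-bit result
    if acc >> (2 * n - 1):                     # reinterpret as a signed value
        acc -= 1 << (2 * n)
    return acc


def bit_level_matvec(matrix, vector, n):
    """Matrix-vector product whose scalar multiplies use the bit-level routine above."""
    return [
        sum(baugh_wooley_multiply(m_ik, x_k, n) for m_ik, x_k in zip(row, vector))
        for row in matrix
    ]
```

For example, `bit_level_matvec([[1, -2], [3, 4]], [-5, 6], 4)` treats every entry as a 4-bit two's complement word. A matrix-matrix product would apply the same routine to each column of the second matrix.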
