Module to Perform Multiplication, Division, and Square Root in Systolic Arrays for Matrix Computations

A module that performs multiplication, division, and square root is presented. The implementation is compact because most of the components are shared by all three operations; its complexity is similar to that of a radix-2 divider. All three operations have the same execution time: one bit of the result is produced per cycle, beginning with the most significant bit. The cycle time is kept small by using redundant addition (carry-save or signed-digit) and a result-digit selection based on a low-precision estimate of the partial remainder. These properties make the module suitable for systolic arrays for matrix computations. Moreover, the communication between these modules in a systolic array is bit-serial, yet the delay is the same as that of systems with bit-parallel communication.
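As an illustration of producing one result digit per cycle, most significant first, with the digit chosen from a low-precision estimate of the partial remainder, the following is a minimal sketch of a textbook radix-2 digit recurrence for division. It is not taken from the paper and does not model the shared multiplication and square-root paths or the redundant adder itself. The names `select_digit` and `srt_divide` are hypothetical, and the estimate is modeled simply by truncating the shifted remainder to one fractional bit, under the assumptions 1/2 <= d < 1 and |x| <= d.

```python
from fractions import Fraction
import math


def select_digit(estimate):
    """Pick a result digit in {-1, 0, 1} from a coarse estimate of 2*w.

    Assumption: the estimate is 2*w truncated to a multiple of 1/2,
    so it never exceeds the true value.
    """
    if estimate >= 0:
        return 1
    if estimate >= Fraction(-1, 2):
        return 0
    return -1


def srt_divide(x, d, n):
    """Radix-2 digit-recurrence division, one signed digit per iteration.

    Returns (digits, quotient, remainder) such that
    x == d * quotient + remainder / 2**n, with |remainder| <= d.
    """
    x, d = Fraction(x), Fraction(d)
    if not (Fraction(1, 2) <= d < 1) or abs(x) > d:
        raise ValueError("requires 1/2 <= d < 1 and |x| <= d")

    w = x                      # partial remainder, kept exact in this sketch
    digits = []
    quotient = Fraction(0)
    for j in range(1, n + 1):
        shifted = 2 * w
        # Low-precision estimate: keep only one fractional bit of 2*w.
        estimate = Fraction(math.floor(2 * shifted), 2)
        q = select_digit(estimate)
        w = shifted - q * d    # recurrence: w[j] = 2*w[j-1] - q[j]*d
        digits.append(q)       # most significant digit comes out first
        quotient += Fraction(q, 2 ** j)
    return digits, quotient, w


if __name__ == "__main__":
    digits, quotient, remainder = srt_divide(Fraction(3, 8), Fraction(3, 4), 8)
    print(digits, quotient, remainder)
```

Because the digit set {-1, 0, 1} is redundant, a digit chosen from the truncated estimate can be compensated by later digits, which is why the selection does not need the full-precision remainder. Converting the signed digits into a conventional binary result is a separate step that this sketch performs by plain summation.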
