A fast parallel multiplier-accumulator using the modified Booth algorithm

This paper presents a dependence graph (DG) to visualize and describe merged multiply-accumulate (MAC) hardware based on the modified Booth algorithm (MBA). The carry-save technique is used in the Booth encoder, the Booth multiplier, and the accumulator sections to ensure the fastest possible implementation. The DG applies to any MAC data word size and allows the design of multiplier structures that are regular and have minimal delay, sign-bit extension, and datapath width. Using the DG, a fast pipelined implementation is proposed and evaluated with an accurate delay model for deep-submicron CMOS technology. The delay model describes multi-level gate delays, taking into account input ramp and output loading. Based on the delay model, the proposed pipelined parallel MAC design is three times faster than other parallel MAC schemes that are based on the MBA. The speedup results from merging the accumulate and multiply operations and from the extensive use of carry-save techniques.
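
The idea of a merged MAC can be illustrated with a small behavioural sketch. It is not the paper's dependence graph or circuit: it assumes the standard radix-4 form of the modified Booth algorithm, models words as unbounded Python integers rather than fixed-width datapaths, and reduces the partial products serially instead of in parallel. The function names `booth_radix4_digits`, `csa`, and `mac` are hypothetical. The accumulator enters the carry-save reduction as just another operand, so only one carry-propagate addition is needed at the end.

```python
def booth_radix4_digits(y, n):
    """Recode an n-bit two's-complement multiplier y into n/2 digits in {-2..2}.

    Digit i is y[2i] + y[2i-1] - 2*y[2i+1], with y[-1] taken as 0.
    """
    if n % 2:
        raise ValueError("n must be even for radix-4 recoding")
    u = y & ((1 << n) - 1)        # n-bit two's-complement pattern of y
    digits = []
    prev = 0                      # implicit bit y[-1] = 0
    for i in range(0, n, 2):
        b0 = (u >> i) & 1
        b1 = (u >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def csa(a, b, c):
    """3:2 carry-save adder: returns (s, carry) with s + carry == a + b + c."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry


def mac(acc, x, y, n):
    """Return acc + x*y for an n-bit two's-complement multiplier y.

    Assumption: the accumulator is merged into the partial-product
    reduction, and the result stays in carry-save form (s, c) until
    the single final addition.
    """
    s, c = acc, 0
    for i, d in enumerate(booth_radix4_digits(y, n)):
        pp = (d * x) << (2 * i)   # Booth-selected partial product, weight 4**i
        s, c = csa(s, c, pp)
    return s + c                  # the only carry-propagate addition


print(booth_radix4_digits(7, 4))
print(mac(10, 5, -3, 4))
```

With a 4-bit multiplier, 7 is recoded as the digits -1 and 2 (that is, -1 + 2·4), so two partial products replace four. In hardware the digit times `x` product is a shift and optional negation selected by the Booth encoder, not a general multiplication.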
