A high effective algorithm of 32-bit multiply and MAC instructions' VLSI implementation with 32×8 multiplier-accumulator in DSP applications

Multiply and multiply-accumulate (MAC) instructions (see ARM DDI 0100E, ARM Architecture Reference Manual) are fundamental to digital signal processing (DSP) applications, so implementing them efficiently is very important in embedded DSP cores and in high-performance processor cores with enhanced DSP instructions. This paper presents an algorithm for the VLSI implementation of 32×32 multiply and MAC instructions using a 32×8 multiplier-accumulator. A 32×32 multiplication is carried out as four 32×8 multiplications. The result of each 32×8 multiplication serves as a partial product for the next 32×8 operation, and once the results of all four multiplications have been accumulated, the 32×32 result is obtained. Only the 32×8 multiplication is implemented in hardware, as a Booth multiplier. The resulting implementation of the multiply and MAC instructions offers a favorable trade-off between a serial multiplier and a parallel multiplier.
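The decomposition described above can be illustrated with a small behavioral model. This is a minimal sketch under stated assumptions, not the authors' datapath: it treats both operands as unsigned, processes the multiplier one byte at a time starting from the least significant byte, and uses ordinary integer multiplication in place of the Booth-encoded 32×8 hardware multiplier. The function names `mul32x8`, `mul32x32`, and `mac32` are hypothetical and chosen only for illustration.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def mul32x8(a, b8):
    """Stand-in for the 32x8 hardware multiplier (unsigned, assumption).

    Returns the 40-bit product of a 32-bit value and an 8-bit value.
    """
    assert 0 <= a <= MASK32
    assert 0 <= b8 <= 0xFF
    return a * b8


def mul32x32(a, b, acc=0):
    """Build a 32x32 product from four 32x8 steps.

    Each step multiplies `a` by one byte of `b`, aligns the result to the
    byte position, and adds it to the running sum, so every intermediate
    sum acts as the partial product carried into the next step.
    The result is truncated to 64 bits.
    """
    assert 0 <= b <= MASK32
    for i in range(4):
        b8 = (b >> (8 * i)) & 0xFF          # select byte i of the multiplier
        acc += mul32x8(a, b8) << (8 * i)    # weight by 2**(8*i) and accumulate
    return acc & MASK64


def mac32(a, b, acc):
    """Multiply-accumulate: acc + a * b, modulo 2**64 (unsigned, assumption)."""
    return mul32x32(a, b, acc)
```

For example, `mul32x32(12345, 67890)` returns the same value as `12345 * 67890`, and `mac32(a, b, acc)` simply seeds the running sum with `acc` instead of zero, which is why the same four-step loop can serve both the multiply and the MAC instructions in this model. Signed operands would need sign handling (for instance in the most significant byte), which this sketch leaves out.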