Efficient serial/parallel inner-product computation

A class of serial/parallel architectures for inner-product computation, based on carry-save accumulator arrays, is described. In their basic form, these arrays act as carry-save multiply/adders. A simple modification of the coefficient feed allows flexible extension to inner-product computation on short vectors using distributed arithmetic. The resulting modules may be cascaded to handle longer vectors, forming high-level VLSI digital signal processing subsystems.
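
As an illustration of the distributed-arithmetic idea mentioned above, the following is a minimal software sketch of a bit-serial inner product in the common table-lookup formulation. It is a behavioural model only: it does not model the carry-save array, and the paper's coefficient-feed arrangement may differ from a lookup table. The names build_lut and da_inner_product, the two's-complement input format, and the word width parameter bits are assumptions made for this example.

```python
def build_lut(coeffs):
    """Entry a holds the sum of coeffs[k] over every bit k that is set in a."""
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (a >> k) & 1)
        for a in range(1 << n)
    ]


def da_inner_product(coeffs, xs, bits):
    """Inner product of coeffs with two's-complement inputs xs of width `bits`.

    One bit of every input is consumed per step (serial in the data),
    while all coefficients contribute at once through the table (parallel
    in the coefficients).
    """
    lut = build_lut(coeffs)
    mask = (1 << bits) - 1
    words = [x & mask for x in xs]  # two's-complement encoding of each input

    acc = 0
    for b in range(bits):
        # Gather bit b of every input into one table address.
        addr = 0
        for k, w in enumerate(words):
            addr |= ((w >> b) & 1) << k
        term = lut[addr] << b
        # The most significant bit carries negative weight in two's complement.
        acc += -term if b == bits - 1 else term
    return acc


if __name__ == "__main__":
    coeffs = [3, -5, 7, 2]
    xs = [4, -3, 1, -8]
    print(da_inner_product(coeffs, xs, bits=8))
    print(sum(c * x for c, x in zip(coeffs, xs)))  # direct reference value
```

The table grows as 2^n in the vector length n, which is one reason a scheme of this kind suits short vectors, with longer vectors handled by combining the outputs of several short-vector modules.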