A novel power-efficient multi-operand digit-multiplier using reconfiguration and clock gating

Digit serial–serial multipliers are one approach to power-optimized multiplication, in which operands are fed one digit at a time. This significantly reduces the required chip area and hence power. In this paper, a power-efficient reconfigurable digit serial–serial multiplier is proposed. Power efficiency is achieved using two techniques: reconfiguration and clock gating. Reconfiguration allows the proposed multiplier to multiply sub-width operands without extending them to full width; that is, a multiplier composed of $m$ sub-multipliers, each of width $n$, can handle $mn \times mn$, $\tfrac{1}{2}mn \times \tfrac{1}{2}mn$, $\tfrac{1}{4}mn \times \tfrac{1}{4}mn, \ldots, n \times n$ multiplications. It also enables the multiplier to perform multiple multiplications concurrently rather than sequentially; that is, the multiplier can handle $1 \times (mn \times mn)$, $2 \times (\tfrac{1}{2}mn \times \tfrac{1}{2}mn)$, $4 \times (\tfrac{1}{4}mn \times \tfrac{1}{4}mn), \ldots, m \times (n \times n)$ multiplications concurrently. Mathematical operations such as the matrix product benefit most from concurrent multiplications. Clock gating reduces power by disabling unused blocks and enabling utilized blocks only when their relevant inputs arrive. Compared with a non-reconfigurable design without clock gating, simulation results show that the proposed multiplier reduces power consumption: for $m=2$, $n=32$, and digit width $d=4$, power is reduced by 38 % in $32 \times 32$ mode and by 49 % in $2 \times (32 \times 32)$ mode. Compared with a standard parallel multiplier, simulation results also show that the proposed multiplier reduces energy consumption: for $m=2$, $n=32$, and digit width $d=32$, energy is reduced by 46 % in $32 \times 32$ mode and by 60 % in $2 \times (32 \times 32)$ mode.
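The operating modes and the digit-at-a-time operand feeding described above can be illustrated with a small behavioural sketch. This is not the circuit proposed in the paper. It assumes unsigned operands, least-significant-digit-first arrival, and $m$ being a power of two, and it does not model clock gating or power. The function names `modes`, `digits`, and `digit_serial_multiply` are hypothetical and chosen only for illustration.

```python
def modes(m, n):
    """List (count, width) pairs for a multiplier built from m sub-multipliers
    of width n: `count` concurrent multiplications of width x width bits.
    Assumption: m is a power of two."""
    if m < 1 or m & (m - 1):
        raise ValueError("m must be a power of two")
    result = []
    count = 1
    while count <= m:
        result.append((count, m * n // count))
        count *= 2
    return result


def digits(x, width, d):
    """Split an unsigned width-bit integer into d-bit digits, least significant first."""
    mask = (1 << d) - 1
    return [(x >> shift) & mask for shift in range(0, width, d)]


def digit_serial_multiply(a, b, width, d):
    """Behavioural model of a digit serial-serial multiply (unsigned only).

    Each step consumes one d-bit digit of each operand and updates the
    running product so that it always equals the product of the operand
    portions received so far."""
    a_seen = 0   # low-order part of a received so far
    b_seen = 0   # low-order part of b received so far
    product = 0
    pairs = zip(digits(a, width, d), digits(b, width, d))
    for step, (da, db) in enumerate(pairs):
        shift = step * d
        # Cross terms between the new digits and the previously seen parts.
        product += (da * b_seen + db * a_seen) << shift
        # Product of the two new digits, weighted by both digit positions.
        product += (da * db) << (2 * shift)
        a_seen |= da << shift
        b_seen |= db << shift
    return product


if __name__ == "__main__":
    print(modes(2, 32))
    print(hex(digit_serial_multiply(0xDEADBEEF, 0x12345678, 32, 4)))
```

With `m=2` and `n=32`, `modes` lists one full-width multiplication and two concurrent 32-bit multiplications, which matches the mode pattern given in the abstract. In the software model, a smaller `d` means more steps per multiplication; the area and power consequences of that choice are hardware properties that this sketch does not capture.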
