Paper information - Fast Radix-10 Multiplication Using Redundant BCD Codes

We present the algorithm and architecture of a BCD parallel multiplier that exploits properties of two redundant BCD codes to speed up its computation: the redundant BCD excess-3 code (XS-3) and the overloaded BCD representation (ODDS). In addition, new techniques are developed to significantly reduce the latency and area of previous representative high-performance implementations. Partial products are generated in parallel using a signed-digit radix-10 recoding of the BCD multiplier with the digit set [-5, 5], and a set of positive multiplicand multiples (0X, 1X, 2X, 3X, 4X, 5X) coded in XS-3. This encoding has several advantages. First, it is a self-complementing code, so a negative multiplicand multiple can be obtained by simply inverting the bits of the corresponding positive one. Second, the available redundancy allows fast and simple carry-free generation of the multiplicand multiples. Finally, the partial products can be recoded to the ODDS representation by simply adding a constant factor into the partial product reduction tree. Since ODDS uses a 4-bit binary encoding similar to that of non-redundant BCD, conventional binary VLSI circuit techniques, such as binary carry-save adders and compressor trees, can be adapted efficiently to perform decimal operations. To show the advantages of our architecture, we have synthesized an RTL model for $16\times 16$-digit and $34\times 34$-digit multiplications and performed a comparative survey of the most representative previous designs. We show that the proposed decimal multiplier achieves an area improvement of roughly 20 to 35 percent for similar target delays with respect to the fastest implementation.
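The two ideas the abstract leans on, the signed-digit radix-10 recoding and the self-complementing property of XS-3, can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design. The function names are hypothetical, digits are stored least significant first, and the recoding rule used here (emit a transfer of 1 whenever a multiplier digit is 5 or more) is one common way to reach the digit set [-5, 5]; the abstract does not spell out the exact rule.

```python
def to_digits(n, width):
    """Split a non-negative integer into `width` decimal digits, LSD first."""
    return [(n // 10**i) % 10 for i in range(width)]


def sd_radix10_recode(digits):
    """Recode BCD digits (LSD first) into signed digits in [-5, 5].

    Assumed rule: digit i sends a transfer of 1 to digit i+1 when it is >= 5.
    Each transfer depends only on its own digit, so no carry ripples.
    The result has one extra digit to hold the final transfer.
    """
    transfers = [1 if d >= 5 else 0 for d in digits]
    recoded = []
    for i, d in enumerate(digits):
        incoming = transfers[i - 1] if i > 0 else 0
        recoded.append(d - 10 * transfers[i] + incoming)
    recoded.append(transfers[-1] if digits else 0)
    return recoded


def sd_value(signed_digits):
    """Integer value of a signed-digit radix-10 number (LSD first)."""
    return sum(d * 10**i for i, d in enumerate(signed_digits))


def multiply_via_recoding(x, y, width):
    """Sum the partial products d_i * X * 10^i over the recoded multiplier.

    Every |d_i| is at most 5, so only the multiples 0X..5X and their
    negations are ever needed.
    """
    return sum(d * x * 10**i
               for i, d in enumerate(sd_radix10_recode(to_digits(y, width))))


def xs3_encode(digits):
    """Encode digit values as 4-bit excess-3 codes (code = value + 3)."""
    return [d + 3 for d in digits]


def xs3_invert(codes):
    """Invert all four bits of every XS-3 code."""
    return [c ^ 0xF for c in codes]


def xs3_value(codes):
    """Integer value of a list of XS-3 codes (LSD first)."""
    return sum((c - 3) * 10**i for i, c in enumerate(codes))


if __name__ == "__main__":
    print(sd_radix10_recode(to_digits(1958, 4)))
    print(multiply_via_recoding(4729, 1958, 4) == 4729 * 1958)
    print(xs3_value(xs3_invert(xs3_encode(to_digits(4729, 4)))))
```

Inverting the bits of an XS-3 code with value v yields the code for 9 - v, so the inverted operand is the nines' complement of the original. In this model a negative multiple then only needs the usual +1 correction, which a hardware design can fold into the reduction tree.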

Javier D. Bruguera | Álvaro Vázquez | Elisardo Antelo