Multiple-mode floating-point multiply-add fused unit for trading accuracy with power consumption

With the wide use of floating-point (FP) multiply and accumulate operations in multimedia and digital signal processing applications, many modern processors adopt an FP multiply-add fused (MAF) unit to achieve high performance, improve accuracy and reduce power consumption. However, FP arithmetic units usually account for a major portion of a processor's area and power dissipation. In this paper, we propose a multiple-mode FP multiply-add fused unit that uses iterative multiplication and truncated addition techniques to support seven operating modes with different error levels for low-power applications. It can execute a multiply-accumulate operation in one of three modes, a multiplication in one of two modes, or an addition in one of two modes. Compared with the traditional IEEE 754 single-precision FP MAF, the proposed unit occupies 4.5% less area and has a 23% longer delay; in return, its multiple modes can sacrifice a small amount of accuracy (< 1%) to save a large amount of power (> 33%).
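The accuracy side of this trade-off can be illustrated in software. The following is a minimal sketch, not the unit described in the paper: it assumes that a reduced-accuracy mode can be modeled by zeroing low-order fraction bits of single-precision operands and of the result, and it handles finite inputs only. The names `truncate_mantissa`, `approx_multiply_add`, `relative_error` and the parameter `keep_bits` are hypothetical and do not come from the source, which does not state the datapath widths of its seven modes.

```python
import struct


def truncate_mantissa(x: float, keep_bits: int) -> float:
    """Round x to IEEE 754 single precision, then zero all but the top
    keep_bits of the 23-bit fraction field (truncation toward zero).
    Assumption: this stands in for a narrower hardware datapath."""
    if not 0 <= keep_bits <= 23:
        raise ValueError("keep_bits must be between 0 and 23")
    # Reinterpret the float32 encoding as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Keep sign, exponent, and the leading keep_bits fraction bits.
    mask = (0xFFFFFFFF << (23 - keep_bits)) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


def approx_multiply_add(a: float, b: float, c: float, keep_bits: int) -> float:
    """Compute a * b + c with operands and result truncated to keep_bits
    fraction bits. Hypothetical model, not the paper's architecture."""
    ta = truncate_mantissa(a, keep_bits)
    tb = truncate_mantissa(b, keep_bits)
    tc = truncate_mantissa(c, keep_bits)
    # The intermediate product and sum are evaluated in double precision.
    return truncate_mantissa(ta * tb + tc, keep_bits)


def relative_error(a: float, b: float, c: float, keep_bits: int) -> float:
    """Relative error of the truncated result against double precision."""
    exact = a * b + c
    return abs(approx_multiply_add(a, b, c, keep_bits) - exact) / abs(exact)


if __name__ == "__main__":
    for keep_bits in (23, 16, 12, 8):
        err = relative_error(1.2345, 6.789, 0.5, keep_bits)
        print(f"keep_bits={keep_bits:2d}  relative error={err:.3e}")
```

Running the script prints the relative error for a few fraction widths. It models numerical error only; the area, delay and power figures reported above depend on the hardware implementation and cannot be reproduced this way.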
