The Calculation and Anticipation Unit for Floating-Point Addition

Most recent microprocessors include multiple specialized functional units to improve their performance. This paper presents a new functional unit, the calculation and anticipation (C&A) unit, for the IEEE 754 standard floating-point adder (FPA), which is the most important and most frequently used arithmetic component in both modern CPUs and GPUs. The C&A unit performs the rounding step and the readjustment step, which are known to be time-consuming steps of floating-point addition, in parallel with significand addition. It therefore substantially reduces the FPA critical-path delay and also slightly reduces FPA area. Synthesis results show that the double-precision FPA with the C&A unit improves critical-path delay by about 17.17% and saves about 8.32% area compared with the conventional one. It also has a 5.90% advantage in area and a 19.58% improvement in worst-case delay over the double-precision FPA from the Open Core module "fpu_double" (rev 14 2010-02-13) synthesized in the same 0.13-μm bulk CMOS technology. Furthermore, compared with the two-path double-precision FPA synthesized using LSI Logic's gflxp 0.11-μm CMOS library, it has about a 4.30% advantage in critical-path delay and saves almost one-third of the area, measured in the number of individual cells.
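To make the step ordering concrete, the following is a minimal software sketch of the conventional, strictly sequential flow (alignment, significand addition, readjustment, rounding, and a second readjustment) that the C&A unit is described as overlapping. It is not the C&A unit or any of the compared designs. It assumes two positive, normal double-precision operands, round-to-nearest-even, and no overflow; the function names and the three-bit guard/round/sticky layout are illustrative choices.

```python
import struct

FRAC_BITS = 52   # stored fraction bits of an IEEE 754 double
EXTRA = 3        # guard, round, and sticky bits kept below the significand


def decode(x):
    """Split a positive normal double into (biased exponent, 53-bit significand)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    exp = (bits >> FRAC_BITS) & 0x7FF
    assert bits >> 63 == 0 and 0 < exp < 0x7FF, "sketch handles positive normals only"
    return exp, (bits & ((1 << FRAC_BITS) - 1)) | (1 << FRAC_BITS)


def encode(exp, sig):
    """Pack a biased exponent and a normalized 53-bit significand into a double."""
    assert 0 < exp < 0x7FF, "sketch does not handle overflow"
    bits = (exp << FRAC_BITS) | (sig & ((1 << FRAC_BITS) - 1))
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


def fp_add_sequential(a, b):
    ea, sa = decode(a)
    eb, sb = decode(b)
    if ea < eb:                        # keep the larger exponent in (ea, sa)
        ea, sa, eb, sb = eb, sb, ea, sa

    # Step 1: alignment. Shift the smaller operand right, folding lost bits
    # into the least significant (sticky) bit.
    shift = ea - eb
    sa <<= EXTRA
    sb <<= EXTRA
    lost = sb & ((1 << shift) - 1)
    sb = (sb >> shift) | (1 if lost else 0)

    # Step 2: significand addition.
    total = sa + sb
    exp = ea

    # Step 3: readjustment. Same-sign addition carries out by at most one bit.
    if total >> (FRAC_BITS + EXTRA + 1):
        total = (total >> 1) | (total & 1)
        exp += 1

    # Step 4: rounding to nearest, ties to even.
    sig = total >> EXTRA
    rem = total & ((1 << EXTRA) - 1)
    half = 1 << (EXTRA - 1)
    if rem > half or (rem == half and (sig & 1)):
        sig += 1

    # Step 5: second readjustment if rounding overflowed the significand.
    if sig >> (FRAC_BITS + 1):
        sig >>= 1
        exp += 1
    return encode(exp, sig)
```

In this sketch, steps 3 to 5 each wait for the result of the significand addition in step 2. That serial dependence is the kind of delay the abstract attributes to the rounding and readjustment steps.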
