Design tradeoff analysis of floating-point adders in FPGAs

With gate counts of ten million, field-programmable gate arrays (FPGAs) are becoming suitable for floating-point computations. Addition is the most complex operation in a floating-point unit; it can incur a major delay and requires significant area. Over the years, the VLSI community has developed many floating-point adder algorithms aimed primarily at reducing overall latency. An efficient floating-point adder design offers major area and performance improvements for FPGAs. Given recent advances in FPGA architecture and area density, latency has become the main focus in attempts to improve performance. This paper studies the implementation of the standard, leading-one predictor (LOP), and far and close datapath (2-path) floating-point addition algorithms in FPGAs. Each algorithm has complex sub-operations that contribute significantly to the overall latency of the design. Different implementations of each sub-operation are investigated and then synthesized onto a Xilinx Virtex-II Pro FPGA device. The standard and LOP algorithms are also pipelined into five stages and compared with the Xilinx IP. According to the results, the standard algorithm is the best implementation with respect to area, occupying 541 slices, but it has a large overall latency of 27.059 ns. The LOP algorithm reduces latency by 6.5% at the cost of a 38% increase in area compared to the standard algorithm. The 2-path implementation shows a 19% reduction in latency at the cost of an 88% increase in area compared to the standard algorithm. The five-stage pipelined standard implementation shows a 6.4% improvement in clock speed compared to the Xilinx IP with a 23% smaller area requirement. The five-stage pipelined LOP implementation shows a 22% improvement in clock speed compared to the Xilinx IP at a cost of 15% more area.
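The relative figures above can be turned into approximate absolute values with a few lines of arithmetic. The following is a minimal sketch, not part of the paper: it assumes the quoted percentages apply directly to the standard algorithm's 27.059 ns and 541 slices, and because those percentages are rounded, the results are estimates rather than reported measurements. All names in the code are illustrative.

```python
# Baseline figures quoted in the abstract for the standard algorithm.
STANDARD_LATENCY_NS = 27.059
STANDARD_SLICES = 541

# Relative changes quoted in the abstract, versus the standard algorithm:
# (fractional latency reduction, fractional area increase).
RELATIVE_CHANGES = {
    "LOP": (0.065, 0.38),
    "2-path": (0.19, 0.88),
}


def estimate(latency_reduction, area_increase):
    """Apply the rounded percentages to the baseline (an approximation)."""
    latency_ns = STANDARD_LATENCY_NS * (1.0 - latency_reduction)
    slices = STANDARD_SLICES * (1.0 + area_increase)
    return latency_ns, slices


# Estimated (latency in ns, area in slices) for each non-baseline algorithm.
estimates = {name: estimate(*change) for name, change in RELATIVE_CHANGES.items()}

for name, (latency_ns, slices) in estimates.items():
    print(f"{name}: ~{latency_ns:.2f} ns, ~{slices:.0f} slices")
```

Under these assumptions the LOP design comes out at roughly 25.3 ns and 747 slices, and the 2-path design at roughly 21.9 ns and 1017 slices, which makes the latency-for-area trade explicit.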
