Design and implementation of double precision floating point division and square root on FPGAs

This paper presents sequential and pipelined designs of a double precision floating point divider and square root unit. The pipelining of these units is based on partial and full unrolling of the iterations in low-radix digit recurrence algorithms. The units are synthesized to produce common-denominator implementations that can be mapped onto any FPGA chip, regardless of architectural differences between chips. The implementations show that their performance is comparable to, and sometimes higher than, that of non-iterative designs based on high-radix numbers. While the iterative divider and square root unit occupy less than 1% of an XC2V6000 FPGA chip, their pipelined counterparts can reach throughputs of 100 MFLOPS while consuming a modest 8% of the chip area. The pipelined designs target the high-throughput computations encountered in some space applications.
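To make the idea of a low-radix digit recurrence concrete, the following is a minimal software sketch, not the paper's hardware design. It assumes a radix-2 restoring recurrence on unsigned integers (the abstract does not say which recurrence variant the authors use), and the function names `restoring_divide` and `digit_recurrence_sqrt` are hypothetical. Each loop iteration retires one result bit. In hardware terms, a sequential unit would reuse one such step over many cycles, while unrolling would lay out several (partial) or all (full) steps as pipeline stages.

```python
def restoring_divide(dividend: int, divisor: int, n_bits: int) -> tuple[int, int]:
    """Radix-2 restoring division (illustrative sketch, not the paper's design).

    Produces one quotient bit per iteration. Assumes 0 <= dividend < 2**n_bits
    and divisor > 0. Returns (quotient, remainder).
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    remainder = 0
    quotient = 0
    for i in range(n_bits - 1, -1, -1):
        # Shift the partial remainder left and bring down the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        # Trial subtraction: keep the result only if it is non-negative.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


def digit_recurrence_sqrt(radicand: int, n_bits: int) -> tuple[int, int]:
    """Radix-2 digit-by-digit square root (illustrative sketch).

    Produces one root bit per iteration. Assumes 0 <= radicand < 2**(2 * n_bits).
    Returns (root, remainder) with root**2 + remainder == radicand.
    """
    root = 0
    remainder = 0
    for i in range(n_bits - 1, -1, -1):
        # Bring down the next pair of radicand bits.
        remainder = (remainder << 2) | ((radicand >> (2 * i)) & 3)
        # Candidate subtrahend if the next root bit is 1: 4 * root + 1.
        trial = (root << 2) | 1
        root <<= 1
        if remainder >= trial:
            remainder -= trial
            root |= 1
    return root, remainder


if __name__ == "__main__":
    print(restoring_divide(100, 7, 8))     # (14, 2)
    print(digit_recurrence_sqrt(144, 4))   # (12, 0)
```

For a double precision operand, the same loops would run over the 53-bit significand (plus any guard bits needed for rounding), which is why the iteration count, and therefore the depth of a fully unrolled pipeline, grows with the precision.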
