FPGA implementation of very high radix square root with prescaling

In this paper, we investigate the implementation of a very high radix square root unit with prescaling on a modern FPGA. A parameterized sequential architecture has been implemented for Xilinx Spartan-6 devices. It relies on a handcrafted multiply-accumulate (MAC) unit, built from DSP48A1 slices, to perform the multiply and multiply-add operations required by the algorithm. The proposed scheme uses all the features of the Spartan-6 DSP slice: the pre-adder, the 18×18-bit multiplier, and the post-adder. Because the MAC unit takes advantage of the limited size of one of its operands, only a small number of DSP blocks is needed, and this number grows linearly with the operand size. Results show that implementations for high-precision (quad) numbers are favored in terms of performance.
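
The linear growth in DSP usage can be illustrated with a small software model. The sketch below is not the paper's MAC design: it only assumes that the wide operand is split into unsigned 17-bit chunks (a common way to use a signed 18×18-bit multiplier) and that the narrow operand fits in a single chunk. The names `split_chunks` and `mac_wide_by_narrow` and the parameter `chunk_bits` are hypothetical.

```python
def split_chunks(x, width, chunk_bits=17):
    """Split a non-negative integer into little-endian chunks of chunk_bits bits."""
    mask = (1 << chunk_bits) - 1
    count = -(-width // chunk_bits)  # ceiling division
    return [(x >> (i * chunk_bits)) & mask for i in range(count)]


def mac_wide_by_narrow(a, b, c, width, chunk_bits=17):
    """Return (a * b + c, number of partial products).

    a is a `width`-bit operand, b must fit in one chunk (assumption),
    and c is the value to accumulate. Each partial product stands in
    for one multiplier-sized operation.
    """
    if a < 0 or a >> width:
        raise ValueError("a does not fit in the stated width")
    if b < 0 or b >> chunk_bits:
        raise ValueError("b must fit in a single chunk")
    chunks = split_chunks(a, width, chunk_bits)
    acc = c
    for i, chunk in enumerate(chunks):
        # One narrow multiplication per chunk, shifted into position.
        acc += (chunk * b) << (i * chunk_bits)
    return acc, len(chunks)


if __name__ == "__main__":
    # Significand widths for single, double, and quad precision.
    for width in (24, 53, 113):
        a = (1 << width) - 1
        result, parts = mac_wide_by_narrow(a, 0x1ABCD, 12345, width)
        assert result == a * 0x1ABCD + 12345
        print(width, "bits:", parts, "partial products")
```

Under these assumptions the number of partial products is the operand width divided by the chunk width, rounded up, so it grows linearly with the operand size rather than quadratically as in a full-width multiplication.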
