Very low resource table-based FPGA evaluation of elementary functions

This paper analyzes the FPGA implementation of polynomial-based function evaluation, specifically considering the embedded block RAMs (BRAMs) and multiplier-adders (DSP blocks) available in today's devices. In particular, it discusses the computation of the reciprocal, square root, and inverse square root functions using first- and second-order polynomial approximations. In each case, the sizes of the interpolation intervals are selected according to the maximum polynomial approximation error. Upper bounds for the truncation errors are formally derived in order to choose the sizes of the polynomial coefficients and fixed-point operands. The bit widths of the polynomial coefficients are optimized so that all the required values fit in a single 36 Kbit BRAM. Further, the word lengths and the number of fractional bits of the operands are adjusted so that the fixed-point multiplications and additions can be implemented with the 17×24 unsigned multipliers and 48-bit adders available in the FPGA DSP blocks. The experimental results confirm that a straightforward implementation of the function evaluator using one BRAM and two DSP blocks can provide better than single-precision accuracy. Additionally, an implementation with one BRAM and three DSP blocks can provide a precision of 28 bits, which is more than adequate as the seed for a double-precision operator using one additional Newton-Raphson iteration.
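
The idea of a table-driven second-order evaluator followed by one Newton-Raphson iteration can be illustrated with a minimal Python sketch for the reciprocal on [1, 2). This is not the paper's design: the segment count (`SEG_BITS = 8`), the use of midpoint Taylor coefficients instead of minimax ones, and the use of double-precision floats instead of truncated fixed-point operands are all assumptions made for illustration, and every function name is hypothetical.

```python
SEG_BITS = 8  # assumption: 256 interpolation intervals on [1, 2)


def build_table(seg_bits=SEG_BITS):
    """Per-interval coefficients (c0, c1, c2) of 1/x, expanded at the midpoint.

    Assumption: plain Taylor coefficients, not the minimax coefficients
    a real design would likely store in the BRAM.
    """
    n = 1 << seg_bits
    table = []
    for i in range(n):
        x0 = 1.0 + (i + 0.5) / n  # midpoint of interval i
        table.append((1.0 / x0, -1.0 / x0**2, 1.0 / x0**3))
    return table


def reciprocal_seed(x, table, seg_bits=SEG_BITS):
    """Second-order approximation of 1/x for x in [1, 2)."""
    n = 1 << seg_bits
    i = int((x - 1.0) * n)         # leading fraction bits select the interval
    h = x - (1.0 + (i + 0.5) / n)  # signed offset from the interval midpoint
    c0, c1, c2 = table[i]
    return c0 + h * (c1 + h * c2)  # Horner form: two multiply-add steps


def newton_raphson_step(x, y):
    """One Newton-Raphson refinement of y ~ 1/x; the relative error is squared."""
    return y * (2.0 - x * y)


def max_rel_error(fn, samples=20000):
    """Largest |x * fn(x) - 1| over a uniform sample of [1, 2)."""
    worst = 0.0
    for k in range(samples):
        x = 1.0 + k / samples
        worst = max(worst, abs(fn(x) * x - 1.0))
    return worst


table = build_table()
seed_err = max_rel_error(lambda x: reciprocal_seed(x, table))
refined_err = max_rel_error(
    lambda x: newton_raphson_step(x, reciprocal_seed(x, table))
)
print(f"max relative error of seed:    {seed_err:.3e}")
print(f"max relative error after N-R:  {refined_err:.3e}")
```

With midpoint Taylor coefficients the relative error of the seed is |h/x0|^3, which is at most 2^-27 for 256 intervals, so a single Newton-Raphson step brings the result close to the limit of double-precision arithmetic. The sketch ignores the coefficient and operand truncation errors that the paper bounds formally.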
