Paper information - An FPGA-specific approach to floating-point accumulation and sum-of-products

This article studies two common situations where the flexibility of FPGAs allows one to design application-specific floating-point operators that are more efficient and more accurate than those offered by processors and GPUs. First, for applications that add a large number of floating-point values, an ad-hoc accumulator is proposed. By tailoring its parameters to the numerical requirements of the application, it can be made arbitrarily accurate, at an area cost comparable to that of a standard floating-point adder, and at a higher frequency. The second example is the sum-of-products operation, which is the building block of matrix computations. A novel architecture is proposed that feeds the previous accumulator from a floating-point multiplier whose rounding logic has been removed, again improving the area/accuracy tradeoff. These architectures are implemented within the FloPoCo generator, freely available under the LGPL.
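The idea in the abstract can be illustrated with a small software model. The sketch below is not the paper's hardware architecture and does not use the FloPoCo API. It assumes the accumulator is a wide fixed-point register described by two hypothetical parameters, `msb` and `lsb` (the weights of its most and least significant bits), and that input bits below `lsb` are truncated. It also models the sum-of-products case by feeding exact, unrounded products into the same accumulator.

```python
from fractions import Fraction
import math


class FixedPointAccumulator:
    """Software model of a wide fixed-point accumulator (illustrative only).

    msb and lsb are hypothetical parameter names: the accumulator holds
    values of magnitude below 2**msb with a resolution of 2**lsb.
    """

    def __init__(self, msb, lsb):
        self.msb = msb
        self.lsb = lsb
        self.acc = 0  # running sum, stored as an integer multiple of 2**lsb

    def add(self, x):
        # Convert the input (float or Fraction) exactly, then align it to
        # the accumulator grid. Bits below 2**lsb are dropped by flooring,
        # an assumption that mimics truncation in two's complement.
        scaled = Fraction(x) / Fraction(2) ** self.lsb
        self.acc += math.floor(scaled)
        if abs(self.acc) >= 1 << (self.msb - self.lsb):
            raise OverflowError("accumulator range exceeded; increase msb")

    def value(self):
        # Single final conversion back to floating point.
        return float(Fraction(self.acc) * Fraction(2) ** self.lsb)


def sum_of_products(pairs, msb, lsb):
    """Accumulate a*b for each pair without rounding the products first."""
    acc = FixedPointAccumulator(msb, lsb)
    for a, b in pairs:
        # Fraction multiplication is exact, standing in for a multiplier
        # whose rounding logic has been removed.
        acc.add(Fraction(a) * Fraction(b))
    return acc.value()


if __name__ == "__main__":
    acc = FixedPointAccumulator(msb=64, lsb=-10)
    for x in [2.0 ** 60, 1.0, -(2.0 ** 60)]:
        acc.add(x)
    print(acc.value())                     # fixed-point accumulation
    print((2.0 ** 60 + 1.0) - 2.0 ** 60)   # plain double-precision addition
```

In this model each addition loses at most `2**lsb` to truncation, so the total error is bounded by the number of summands times `2**lsb`. Choosing `lsb` small enough and `msb` large enough for the application's data is the software analogue of tailoring the accumulator parameters.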
