architect: Arbitrary-Precision Hardware With Digit Elision for Efficient Iterative Compute

Many algorithms feature an iterative loop that converges to the result of interest. The numerical operations in such algorithms are generally implemented using finite-precision arithmetic, either fixed- or floating-point, most of which operates least-significant digit first. This results in a fundamental problem: if, after some time, the result has not converged, is this because we have not run the algorithm for enough iterations or because the arithmetic in some iterations was insufficiently precise? There is no easy way to answer this question, so users will often over-budget precision in the hope that the answer will always be to run for a few more iterations. We propose a fundamentally new approach: with appropriate arithmetic able to generate results most-significant digit first, we show that fixed compute-area hardware can be used to calculate an arbitrary number of algorithmic iterations to arbitrary precision, with both precision and approximant index increasing in lockstep. Consequently, datapaths constructed following our principles are more efficient than their traditional arithmetic equivalents where the latter’s precisions are either under- or over-budgeted for the computation of a result to a particular accuracy. Use of most-significant digit-first arithmetic additionally allows us to declare certain digits to be stable at runtime, avoiding their recalculation in subsequent iterations and thereby increasing performance and decreasing memory footprints. Versus arbitrary-precision iterative solvers without the optimizations we detail herein, we achieve up to $16\times$ performance speedups and $1.9\times$ memory savings for the evaluated benchmarks.
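The idea of declaring leading digits stable can be illustrated in software. The following is a minimal sketch and not the hardware method itself: it assumes Newton's iteration for $\sqrt{2}$ as the example algorithm, uses exact rational arithmetic in place of most-significant digit-first operators, and treats the digits shared by two successive approximants as a heuristic for stability. The function names (`newton_sqrt2`, `leading_digits`, `stable_prefix_lengths`) are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def newton_sqrt2(iterations):
    """Yield successive Newton approximants to sqrt(2) as exact rationals."""
    x = Fraction(1)
    for _ in range(iterations):
        x = (x + 2 / x) / 2  # x_{k+1} = (x_k + 2 / x_k) / 2
        yield x


def leading_digits(x, n):
    """Return the integer digit plus the first n fractional decimal digits
    of x (assumed to lie in [1, 10)), most-significant digit first."""
    return str((x.numerator * 10**n) // x.denominator)


def common_prefix_length(a, b):
    """Count how many leading characters two digit strings share."""
    count = 0
    for da, db in zip(a, b):
        if da != db:
            break
        count += 1
    return count


def stable_prefix_lengths(iterations, n_digits):
    """For each pair of successive approximants, report how many leading
    digits agree. In this sketch those digits are the candidates that need
    not be recomputed in later iterations (an assumption for illustration)."""
    lengths = []
    previous = None
    for x in newton_sqrt2(iterations):
        current = leading_digits(x, n_digits)
        if previous is not None:
            lengths.append(common_prefix_length(previous, current))
        previous = current
    return lengths


if __name__ == "__main__":
    print(stable_prefix_lengths(5, 30))
```

In this sketch the count of agreeing leading digits grows with the approximant index, which is the behaviour that makes skipping their recalculation attractive. A real most-significant digit-first datapath would use a redundant number representation and a rigorous stability criterion rather than a plain prefix comparison.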
