Accurate Sum and Dot Product

We present algorithms for summation and dot product of floating-point numbers that are fast in terms of measured computing time. We show that the computed results are as accurate as if computed in twice or K-fold working precision, $K\ge 3$. For twice the working precision, our algorithms for summation and dot product are some 40% faster than the corresponding XBLAS routines while sharing similar error estimates. The algorithms are widely applicable because they require only addition, subtraction, and multiplication of floating-point numbers in the same working precision as the given data. No higher precision is needed, the algorithms are straight loops without branches, and no access to mantissa or exponent is necessary.
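The abstract does not spell out the algorithms, so the following is only a minimal sketch of how a sum and a dot product can be computed "as if in twice the working precision" using nothing but branch-free additions, subtractions, and multiplications in the working precision. It relies on the classical error-free transformations usually attributed to Knuth (for a sum) and to Dekker and Veltkamp (for a product). The function names `two_sum`, `split`, `two_product`, `sum2`, and `dot2` are illustrative choices, and the sketch assumes IEEE 754 double precision with round-to-nearest and no overflow or underflow.

```python
def two_sum(a, b):
    """Return (x, y) with x = fl(a + b) and x + y == a + b exactly."""
    x = a + b
    z = x - a
    y = (a - (x - z)) + (b - z)
    return x, y


def split(a):
    """Split a double into a high and a low part, a == hi + lo."""
    factor = 134217729.0  # 2**27 + 1, the splitting constant for 53-bit doubles
    c = factor * a
    hi = c - (c - a)
    lo = a - hi
    return hi, lo


def two_product(a, b):
    """Return (x, y) with x = fl(a * b) and x + y == a * b exactly."""
    x = a * b
    a1, a2 = split(a)
    b1, b2 = split(b)
    y = a2 * b2 - (((x - a1 * b1) - a2 * b1) - a1 * b2)
    return x, y


def sum2(p):
    """Compensated summation: accumulate rounding errors separately."""
    s = 0.0  # running floating-point sum
    e = 0.0  # running sum of the rounding errors
    for v in p:
        s, q = two_sum(s, v)
        e += q
    return s + e


def dot2(x, y):
    """Compensated dot product of two equal-length sequences."""
    p = 0.0  # running floating-point dot product
    s = 0.0  # running sum of product and addition errors
    for xi, yi in zip(x, y):
        h, r = two_product(xi, yi)
        p, q = two_sum(p, h)
        s += q + r
    return p + s


print(sum2([1e16, 1.0, -1e16]))  # 1.0; a plain left-to-right loop loses the 1.0
print(dot2([134217729.0, 1.0], [134217729.0, -float(2**54)]))  # 268435457.0
```

Both loops contain no branches and never inspect the mantissa or exponent, which matches the properties claimed above; a K-fold variant would repeat the error-accumulation step on the vector of error terms rather than summing them directly.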
