Analysis of some known methods of improving the accuracy of floating-point sums

This paper analyzes some well-known methods for calculating the round-off error in floating-point addition. The methods were introduced by Møller [16], Kahan [11] and Knuth [12]. Necessary and sufficient conditions are given under which these methods produce the value of the round-off error in rounding, truncating and parity arithmetic. The computer-oriented parity arithmetic is not commonly known, but it has some desirable properties, as the paper demonstrates. Some experimental results are also reported.
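As a rough illustration of what "calculating the round-off error" of an addition means, the following Python sketch shows the forms in which these techniques are commonly presented today. It is not taken from the paper. All function names (`two_sum`, `fast_two_sum`, `compensated_sum`, `naive_sum`, `error_is_exact`) are hypothetical. The sketch assumes Python floats are IEEE 754 binary64 values added with round-to-nearest, so it covers only the rounding case; truncating and parity arithmetic are not reproduced here.

```python
from fractions import Fraction


def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and err the rounding error of that addition.

    Branch-free six-operation form, as usually credited to Moller and Knuth.
    Assumes round-to-nearest binary arithmetic and no overflow.
    """
    s = a + b
    b_virtual = s - a            # the part of b that actually reached s
    a_virtual = s - b_virtual    # the part of a that actually reached s
    err = (a - a_virtual) + (b - b_virtual)
    return s, err


def fast_two_sum(a, b):
    """Three-operation variant; assumes |a| >= |b| (the caller must ensure this)."""
    s = a + b
    err = b - (s - a)
    return s, err


def naive_sum(values):
    """Plain left-to-right accumulation, for comparison."""
    total = 0.0
    for x in values:
        total += x
    return total


def compensated_sum(values):
    """Compensated summation in the style usually credited to Kahan."""
    total = 0.0
    comp = 0.0                   # running estimate of the lost low-order part
    for x in values:
        y = x - comp             # fold the previous correction into the new term
        t = total + y
        comp = (t - total) - y   # what was lost when y was added to total
        total = t
    return total


def error_is_exact(a, b, s, err):
    """Check a + b == s + err in exact rational arithmetic."""
    return Fraction(a) + Fraction(b) == Fraction(s) + Fraction(err)


if __name__ == "__main__":
    s, err = two_sum(0.1, 0.2)
    print(s, err, error_is_exact(0.1, 0.2, s, err))

    values = [1.0] + [1e-16] * 10
    print(naive_sum(values), compensated_sum(values))
```

In the second demonstration each small term is below half a unit in the last place of 1.0, so plain accumulation drops all of them, while the compensated loop carries them forward in `comp` until they are large enough to affect the total.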

[1] S. Gill et al. A process for the step-by-step integration of differential equations in an automatic digital computing machine, 1951, Mathematical Proceedings of the Cambridge Philosophical Society.

[2] J. B. Scarborough. Numerical Mathematical Analysis, 1931.

[3] Jack M. Wolfe. Reducing truncation errors by programming, 1964, CACM.

[4] Christian Gram. On the representation of zero in floating-point arithmetic, 1964.

[5] P. Henrici. Elements of numerical analysis, 1966.

[6] Ole Møller. Quasi double-precision in floating point addition, 1965.

[7] Ole Møller. Note on quasi double-precision, 1965.

[8] William Kahan et al. Pracniques: further remarks on reducing truncation errors, 1965, CACM.

[9] Peter Naur. The performance of a system for automatic segmentation of programs within an ALGOL compiler (GIER ALGOL), 1965, CACM.

[10] David Hutchinson et al. Letters to the editor: a final word on reducing truncation errors, 1965, CACM.

[11] Donald Ervin Knuth et al. The Art of Computer Programming, Volume II: Seminumerical Algorithms, 1970.

[12] T. J. Dekker et al. A floating-point technique for extending the available precision, 1971.

[13] Peter Linz et al. Accurate floating-point summation, 1970, CACM.

[14] Robert J. Thompson. Improving round-off in Runge-Kutta computations with Gill's method, 1970, CACM.

[15] Michael A. Malcolm et al. On accurate floating-point summation, 1971, CACM.

[16] J. B. Gosling et al. Design of large high-speed floating-point-arithmetic units, 1971.

[17] James Gregory. A comparison of floating point summation methods, 1972, CACM.