Reproducibility, i.e. obtaining bitwise identical floating-point results from multiple runs of the same program, is a property that many users depend on for debugging or correctness checking [1]. However, the combination of dynamic scheduling of parallel computing resources and floating-point nonassociativity makes reproducibility a challenge even for simple reduction operations such as computing the sum of a vector of numbers in parallel. We propose a technique for floating-point summation that is reproducible independent of the order of summation. Our technique uses Rump's algorithm for error-free vector transformation [2] and is much more efficient than using (possibly very) high-precision arithmetic. Our algorithm trades off efficiency and accuracy: we reproducibly attain reasonably accurate results (with an absolute error bound of c · n^2 · macheps · max |v_i| for a small constant c) with just 2n + O(1) floating-point operations, and quite accurate results (with an absolute error bound of c · n^3 · macheps^2 · max |v_i|) with 5n + O(1) floating-point operations, both with just two reduction operations. Higher accuracy is also possible by increasing the number of error-free transformations. As long as the same rounding mode is used, results computed by the proposed algorithms are reproducible for any run on any platform.
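The order-independence idea can be illustrated with a minimal sketch. It is not the authors' implementation and does not reproduce their operation counts. It only shows the pre-rounding step of an error-free vector transformation: each input is rounded to a common grid defined by a power of two `sigma`, after which every partial sum is exactly representable, so the additions commute. The names `reproducible_sum` and `sigma` are hypothetical. The sketch assumes IEEE 754 double precision with round-to-nearest, fewer than about 2^26 inputs, and no overflow when forming `sigma`.

```python
import math
import random


def reproducible_sum(values):
    """Order-independent sum of a list of floats (illustrative sketch only)."""
    n = len(values)
    if n == 0:
        return 0.0

    # Reduction 1: the maximum magnitude does not depend on the input order.
    vmax = max(abs(v) for v in values)
    if vmax == 0.0:
        return 0.0

    # Pick sigma = 2^(m + e) with 2^m >= n + 2 and 2^e > vmax, so that
    # |v| < sigma / 2^m for every input v.
    m = (n + 1).bit_length()      # ceil(log2(n + 2)) for integer n
    e = math.frexp(vmax)[1]       # vmax = f * 2^e with 0.5 <= f < 1
    sigma = math.ldexp(1.0, m + e)

    # Reduction 2: q is v rounded to a multiple of sigma * 2^-53.
    # All partial sums of the q values stay below sigma in magnitude,
    # so each addition is exact and the order of summation is irrelevant.
    total = 0.0
    for v in values:
        q = (sigma + v) - sigma
        total += q
    return total


if __name__ == "__main__":
    random.seed(0)
    data = [random.uniform(-1.0, 1.0) for _ in range(1000)]
    first = reproducible_sum(data)
    random.shuffle(data)
    # Compare the result before and after shuffling the inputs.
    print(first == reproducible_sum(data))
```

The bits of each input that fall below the grid are discarded, which is where the accuracy loss in this one-pass variant comes from. The abstract's more accurate variant corresponds to applying further error-free transformations to those discarded parts.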
[1] Siegfried M. Rump, et al. Ultimately Fast Accurate Summation. SIAM J. Sci. Comput., 2009.
[2] Guy E. Blelloch, et al. Prefix sums and their applications. 1990.
[3] Siegfried M. Rump, et al. Fast high precision summation. 2010.
[4] Mei Han An, et al. Accuracy and stability of numerical algorithms. 1991.
[5] T. J. Dekker, et al. A floating-point technique for extending the available precision. 1971.
[6] Siegfried M. Rump, et al. Accurate Floating-Point Summation Part I: Faithful Rounding. SIAM J. Sci. Comput., 2008.
[7] Siegfried M. Rump, et al. INTLAB - INTerval LABoratory. SCAN, 1998.
[8] Jürgen Wolff von Gudenberg, et al. A long accumulator like a carry-save adder. Computing, 2011.
[9] Sriram Krishnamoorthy, et al. Effects of floating-point non-associativity on numerical computations on massively multithreaded systems. 2009.
[10] T. Csendes. Developments in Reliable Computing. 2000.
[11] Philip Saponaro, et al. Improving numerical reproducibility and stability in large-scale numerical simulations on GPUs. 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 2010.