Paper information - SIMDizing pairwise sums: a summation algorithm balancing accuracy with throughput

Implementing summation when accuracy and throughput need to be balanced is a challenging endeavour. We present experimental results that provide a sense of when to start worrying and of the expense of the various solutions that exist. We also present a new algorithm based on pairwise summation that achieves 89% of the throughput of the fastest summation algorithms when the data is not resident in L1 cache, while eclipsing the accuracy of significantly slower compensated sums, such as Kahan summation and Kahan-Babuska, that are typically used when accuracy is important.
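For readers unfamiliar with the techniques the abstract names, the following is a minimal scalar sketch of naive, Kahan (compensated), and pairwise summation. It is not the SIMD algorithm the paper proposes. The function names and the base-case block size of 8 are assumptions chosen for illustration only.

```python
import math


def naive_sum(xs):
    """Left-to-right accumulation; rounding error can grow with len(xs)."""
    total = 0.0
    for x in xs:
        total += x
    return total


def kahan_sum(xs):
    """Kahan compensated summation: carry the rounding error of each add."""
    total = 0.0
    comp = 0.0  # running compensation for lost low-order bits
    for x in xs:
        y = x - comp
        t = total + y
        comp = (t - total) - y  # what was lost when forming t
        total = t
    return total


def pairwise_sum(xs, lo=0, hi=None, block=8):
    """Pairwise summation: split the range in half and add the two partial sums.

    Ranges of at most `block` elements (an illustrative choice) are summed naively.
    """
    if hi is None:
        hi = len(xs)
    if hi - lo <= block:
        total = 0.0
        for i in range(lo, hi):
            total += xs[i]
        return total
    mid = lo + (hi - lo) // 2
    return pairwise_sum(xs, lo, mid, block) + pairwise_sum(xs, mid, hi, block)


if __name__ == "__main__":
    data = [0.1] * 100_000
    exact = math.fsum(data)  # correctly rounded reference sum
    for f in (naive_sum, kahan_sum, pairwise_sum):
        print(f.__name__, abs(f(data) - exact))
```

Running the script prints the absolute error of each method against `math.fsum`, which gives a rough feel for the accuracy side of the trade-off the abstract describes. Throughput is not measured here.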

Bob Blainey | Amy Wang | Barnaby Dalton

[1] William Kahan, et al. Pracniques: further remarks on reducing truncation errors, 1965, CACM.

[2] Tommy Färnqvist. Number Theory Meets Cache Locality – Efficient Implementation of a Small Prime FFT for the GNU Multiple Precision Arithmetic Library, 2005.

[3] Nicholas J. Higham, et al. Accuracy and stability of numerical algorithms, Second Edition, 2002.

[4] A. Neumaier. Rundungsfehleranalyse einiger Verfahren zur Summation endlicher Summen, 1974.