THE ACCURACY OF FLOATING POINT SUMMATIONS FOR CG-LIKE METHODS

It is well known that different orderings of a summation in floating point arithmetic can give different sums due to rounding error. This dissertation reviews classic analytic error bounds. A new accurate algorithm is explained thoroughly along with its analytic error bound. These summation algorithms were implemented in the dot products of an iterative solver to determine which summation ordering is more accurate in practice. Another issue is the relationship between dot product accuracy and the convergence of iterative solvers. Analysis and experiments indicate there are two primary sources of error, and show which summation methods are better at reducing them. Results also indicate little correlation between dot product accuracy and the number of iterations required by a solver, within a wide range of accuracies.

1 Statement of problem

Floating point summation is a thoroughly studied area in computer science. It is well known that different summation orders can give greatly different sums. In floating point representation, a number is represented by a mantissa and an exponent, where the mantissa has a limited number of digits. When two floating point numbers are added, round-off occurs. Many numerical analysts have developed analytic error bounds which give upper bounds on the rounding error for different summation algorithms.

There has been less success in using these error bounds to predict the actual accuracy of different summation algorithms in practice, and typically the algorithms are tested and compared on carefully constructed examples. A famous example of this type sums the Taylor series expansion of e^x at x = -20 [17].

The conventional analytic approach has two major flaws. One is that even if error bounds are studied in detail, the actual accuracy achieved in practice by an algorithm cannot be determined. Error bounds only guarantee that the rounding error does not exceed the given bound. The research question here is which algorithm gives the best accuracy most of the time, not which algorithm gives the sharpest error bound. This dissertation reviews these analytic error bounds, but the goal is not a detailed study of error bounds; instead we ask which algorithm gives the best accuracy. The other flaw with the conventional approach is that in real applications the data is not random, and it rarely fits the extreme examples constructed to demonstrate analytic error bounds. So the study of well-constructed examples need not help in choosing a method for real applications.

This dissertation analyzes dot products occurring in iterative solvers for linear systems of equations. Dot products account for virtually all of the computational work in iterative solvers, particularly for CG-like methods, which are built from matrix-vector products and the triangular solves arising from incomplete factorizations. The main concern is the relationship between the accuracy of dot products and the overall performance of iterative solvers. As an example of this, Figure 1.1 shows the convergence history for the conjugate gradient (CG) algorithm applied to a linear system Ax = b. The matrix A is from discretizing the Laplace equation ∆u = 0 using seven-point centered differences on a uniform 24 × 24 × 24 mesh. Even for
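
For concreteness, the following are the standard first-order bounds from the rounding-error literature, stated in the usual notation (supplied here as background, not quoted from the dissertation's review). For recursive, left-to-right summation s = x_1 + ... + x_n computed in floating point arithmetic with unit roundoff u,

    |fl(s) - s| <= (n - 1) u (|x_1| + ... + |x_n|) + O(u^2),

and for a dot product of length n,

    |fl(x^T y) - x^T y| <= n u (|x_1 y_1| + ... + |x_n y_n|) + O(u^2).

Note that the same first-order bound holds for recursive summation under every ordering of the operands, even though the actual errors can differ greatly; other algorithms improve only the constant, e.g. pairwise summation replaces (n - 1) by roughly log2(n) and compensated (Kahan) summation replaces it by about 2. This is exactly why sharp bounds alone cannot settle which ordering is most accurate in practice.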
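
To make the ordering comparison concrete, here is a minimal Python sketch (an illustration written for this review, not code from the dissertation) that sums the Taylor series of e^x at x = -20 under several orderings, using math.fsum as a correctly rounded reference for the sum of the computed terms:

    import math

    x = -20.0
    # Build the Taylor terms x^k / k! until they are negligibly small.
    terms, t, k = [], 1.0, 0
    while abs(t) > 1e-30:
        terms.append(t)
        k += 1
        t *= x / k

    def recursive_sum(values):
        # Plain left-to-right recursive summation.
        s = 0.0
        for v in values:
            s += v
        return s

    true_val = math.exp(-20.0)
    results = {
        "left-to-right": recursive_sum(terms),
        "increasing magnitude": recursive_sum(sorted(terms, key=abs)),
        "decreasing magnitude": recursive_sum(sorted(terms, key=abs, reverse=True)),
        "fsum of computed terms": math.fsum(terms),
    }
    for name, val in results.items():
        print(f"{name:25s} {val: .17e}  rel err {abs(val - true_val) / true_val:.1e}")

One should expect the orderings to disagree, and none of them to come close to e^-20 (about 2.1e-9): the terms grow as large as about 4e7 before cancelling, so the rounding error already present in the computed terms is comparable to or larger than the true sum. This is what makes the example dramatic, and also why such constructed examples say little about typical applications.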
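
Since the discussion centers on dot products inside CG, the following NumPy/SciPy sketch shows unpreconditioned CG on a 7-point Laplacian built by Kronecker sums on a 24 × 24 × 24 mesh. This is a reconstruction for illustration only; the dissertation's exact matrix scaling, right-hand side, and stopping rule are not specified here. Every reduction in the loop is a dot product, and each row of the sparse matrix-vector product is itself a short dot product; these are the summations whose ordering is at issue.

    import numpy as np
    import scipy.sparse as sp

    def cg(A, b, tol=1e-8, maxiter=1000):
        x = np.zeros_like(b)
        r = b - A @ x
        p = r.copy()
        rho = np.dot(r, r)              # dot product
        bnorm = np.sqrt(np.dot(b, b))   # dot product
        for k in range(maxiter):
            Ap = A @ p                  # sparse matvec: one short dot product per row
            alpha = rho / np.dot(p, Ap) # dot product
            x += alpha * p
            r -= alpha * Ap
            rho_new = np.dot(r, r)      # dot product
            if np.sqrt(rho_new) <= tol * bnorm:
                return x, k + 1
            p = r + (rho_new / rho) * p
            rho = rho_new
        return x, maxiter

    # 7-point Laplacian on a uniform n x n x n mesh via Kronecker sums.
    n = 24
    T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
    I = sp.identity(n, format="csr")
    A = (sp.kron(sp.kron(T, I, format="csr"), I, format="csr")
         + sp.kron(sp.kron(I, T, format="csr"), I, format="csr")
         + sp.kron(sp.kron(I, I, format="csr"), T, format="csr"))
    b = np.ones(n ** 3)
    x, iters = cg(A, b)
    print("CG iterations:", iters)

In exact arithmetic CG would terminate; in floating point, the rounding incurred in these dot products is precisely what links summation accuracy to convergence behavior of the kind shown in Figure 1.1.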