Floating-point arithmetic on computers is designed to approximate the corresponding operations over the real numbers as closely as possible. In this paper it is shown by means of counterexamples that this need not be true for existing machines. For achieving good numerical results, a floating-point arithmetic that approximates the real operations as closely as possible is probably best. For achieving verifications on computers, at least a precisely defined computer arithmetic is indispensable. In this paper we first introduce the Kulisch/Miranker theory, which represents a sound basis for computer arithmetic. Each operation is precisely defined and, moreover, is of maximum accuracy; that is, the computed result is the floating-point number of the working precision closest to the infinitely precise result. The theory also covers directed roundings, allowing computations with intervals. These properties hold for the floating-point numbers of single and double precision as well as for the vectors, matrices, and complex extensions over them. In the second part of the paper we develop the theoretical basis for what we call ‘Higher Order Computer Arithmetic’. This is an inclusion theory that allows the development of algorithms to compute bounds for the solution of various problems in numerical analysis. These bounds are automatically verified to be correct, and they are of high accuracy. Very often they are of maximum accuracy, meaning that the left and right bounds of all components of the solution are adjacent in the floating-point screen. Moreover, existence and uniqueness of a solution within the computed bounds are automatically verified by the algorithm. If this verification is not possible, a corresponding message is given. We develop the theory and give algorithms for the solution of systems of linear and nonlinear equations.
As demonstrated by examples, even for extremely ill-conditioned problems, existence and uniqueness of the solution are verified within bounds of least-significant-bit accuracy.
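To make the notions of maximum accuracy and adjacent bounds concrete, the following minimal sketch encloses the exact sum of two floating-point numbers between two neighbouring numbers of the floating-point screen. It is an illustration only and not the construction of the paper: the function name `add_enclosure` is hypothetical, the sketch assumes IEEE 754 double precision with round-to-nearest addition and finite operands whose sum does not overflow, and it emulates directed rounding by comparing against an exact rational sum rather than by switching a hardware rounding mode. It requires Python 3.9 or later for `math.nextafter`.

```python
import math
from fractions import Fraction


def add_enclosure(a: float, b: float) -> tuple[float, float]:
    """Return (lo, hi) with lo <= a + b <= hi, where a + b is the exact sum.

    Hypothetical helper: assumes finite inputs, no overflow, and that
    the built-in float addition rounds to nearest.
    """
    s = a + b                          # rounded sum in working precision
    exact = Fraction(a) + Fraction(b)  # exact rational sum, no rounding
    if Fraction(s) == exact:
        return (s, s)                  # sum is representable: point interval
    if Fraction(s) < exact:
        # rounded down: the upper bound is the next number above s
        return (s, math.nextafter(s, math.inf))
    # rounded up: the lower bound is the next number below s
    return (math.nextafter(s, -math.inf), s)


lo, hi = add_enclosure(0.1, 0.2)
print(lo, hi)  # two adjacent doubles that bracket the exact sum
```

Because the two bounds are either equal or adjacent, no tighter enclosure exists in the working precision, which is the sense in which such bounds are of maximum accuracy.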