Standard for Floating-Point Arithmetic

This standard specifies interchange and arithmetic formats and methods for binary and decimal floating-point arithmetic in computer programming environments. It also specifies exception conditions and their default handling. An implementation of a floating-point system conforming to this standard may be realized entirely in software, entirely in hardware, or in any combination of software and hardware. For operations specified in the normative part of this standard, numerical results and exceptions are uniquely determined by the values of the input data, the sequence of operations, and the destination formats, all under user control.
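
As an illustration of the interchange formats and default exception handling mentioned above, the following is a minimal sketch, not taken from the standard itself. It assumes that Python's `float` is stored as a 64-bit binary interchange value (true for CPython on common platforms, but not guaranteed by the language), and the helper name `decode_binary64` is hypothetical.

```python
import math
import struct


def decode_binary64(x):
    """Split a float into (sign, biased_exponent, fraction) bit fields.

    Assumes the float is a 64-bit binary interchange value:
    1 sign bit, 11 exponent bits, 52 trailing significand bits.
    """
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    biased_exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return sign, biased_exponent, fraction


# Default handling of exceptional results: no trap is taken, and a
# special value is delivered instead.
overflowed = 1e308 * 10.0              # overflow delivers +infinity
invalid = float("inf") - float("inf")  # invalid operation delivers NaN

print(decode_binary64(1.0))
print(decode_binary64(overflowed))
print(math.isnan(invalid))
```

Because the bit fields are fully determined by the input value, the same decoding should give the same result on any platform where the assumption about the storage format holds.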
