Evaluating the Numerical Stability of Posit Arithmetic

The Posit number format was proposed by John Gustafson as an alternative to the IEEE 754 standard floating-point format. Posits offer a unique form of tapered precision, whereas IEEE floating-point numbers provide the same relative precision across most of their representational range. Posits are argued to have a variety of advantages, including better numerical stability and simpler exception handling. The objective of this paper is to evaluate the numerical stability of Posits for solving linear systems, using the Conjugate Gradient method as a representative iterative solver and Cholesky factorization as a representative direct solver. We show that Posits do not consistently improve stability across a wide range of matrices, but we demonstrate that a simple rescaling of the underlying matrix improves convergence rates for the Conjugate Gradient method and reduces backward error for Cholesky factorization. We also demonstrate that 16-bit Posits outperform Float16 for mixed-precision iterative refinement, especially when used in conjunction with a matrix rescaling strategy recently proposed by Nicholas Higham.
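To make the notion of tapered precision concrete, the following is a minimal sketch of a Posit decoder in plain Python. It is an illustration only, not the implementation used in the paper. The function names `posit_fields` and `decode_posit` are hypothetical, and the default of one exponent bit (`es=1`) for 16-bit Posits is an assumption; other configurations use a different `es`.

```python
import math


def posit_fields(bits, n=16, es=1):
    """Split an n-bit posit pattern into (sign, k, exp, frac, frac_bits).

    Assumes the pattern is neither zero nor NaR. `es` is the maximum number
    of exponent bits (assumed to be 1 here for 16-bit posits).
    """
    mask = (1 << n) - 1
    bits &= mask
    sign = bits >> (n - 1)
    if sign:
        bits = (-bits) & mask          # negative posits are two's complement
    body = bits & ((1 << (n - 1)) - 1)  # the n-1 bits after the sign

    # Regime: a run of identical bits, ended by the opposite bit (if any).
    r0 = (body >> (n - 2)) & 1
    run, i = 0, n - 2
    while i >= 0 and ((body >> i) & 1) == r0:
        run += 1
        i -= 1
    k = run - 1 if r0 else -run

    rem = max(i, 0)                     # bits left after the terminating bit
    rest = body & ((1 << rem) - 1)
    e_bits = min(es, rem)               # exponent bits may be cut off
    exp = (rest >> (rem - e_bits)) << (es - e_bits)
    frac_bits = rem - e_bits
    frac = rest & ((1 << frac_bits) - 1)
    return sign, k, exp, frac, frac_bits


def decode_posit(bits, n=16, es=1):
    """Return the real value of an n-bit posit pattern as a Python float."""
    bits &= (1 << n) - 1
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return math.nan                 # NaR ("not a real")
    sign, k, exp, frac, frac_bits = posit_fields(bits, n, es)
    value = (1 + frac / 2 ** frac_bits) * 2.0 ** (k * 2 ** es + exp)
    return -value if sign else value


if __name__ == "__main__":
    for pattern in (0x0001, 0x4000, 0x4800, 0x5000, 0x7FFF):
        frac_bits = posit_fields(pattern)[4]
        print(f"{pattern:#06x}: value={decode_posit(pattern)!r}, "
              f"fraction bits={frac_bits}")
```

Under these assumptions, patterns near 1.0 keep 12 fraction bits, while the largest and smallest positive patterns keep none: the regime field grows with magnitude and crowds out the fraction. This is why rescaling matrix entries toward magnitudes near 1 can matter for Posit-based solvers.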
