Squeezing a Matrix into Half Precision, with an Application to Solving Linear Systems

Motivated by demand from machine learning, modern computer hardware increasingly supports reduced-precision floating-point arithmetic, which provides advantages in speed, energy, and memory ...
