Improving the Performance of the GMRES Method using Mixed-Precision Techniques

The GMRES method is used to solve sparse, non-symmetric systems of linear equations arising in many scientific applications. Within a single node, solver performance is memory bound because its computational kernels have low arithmetic intensity. To reduce data movement and thus improve performance, we investigated the effect of using a mix of single and double precision while retaining double-precision accuracy. Previous efforts have explored reduced precision in the preconditioner, but the use of reduced precision in the solver itself has received limited attention. We found that, to achieve double-precision accuracy, GMRES needs double precision only when computing the residual and updating the approximate solution, although it must restart after each improvement on the order of single-precision accuracy. This finding holds for both tested orthogonalization schemes: Modified Gram-Schmidt (MGS) and Classical Gram-Schmidt with Re-orthogonalization (CGSR). Furthermore, our mixed-precision GMRES, when restarted at least once, ran on average 19% faster than double-precision GMRES with MGS and 24% faster with CGSR. Our implementation uses generic programming techniques to ease the burden of writing implementations for different data types. The Kokkos library allowed us to exploit parallelism and optimize data management, and KokkosKernels was used when producing the performance results. In conclusion, using a mix of single and double precision in GMRES can improve performance while retaining double-precision accuracy.
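The precision split described above can be illustrated with a minimal sketch. It is not the authors' Kokkos implementation: it is plain Python on a small dense matrix, single precision is emulated by rounding every operation through the `struct` module, and all function names (`f32`, `gmres_cycle`, `mixed_gmres`, and so on) are hypothetical. It shows only the numerical behaviour, not the performance benefit. The residual and the solution update use double precision, while each restarted GMRES(m) cycle with MGS runs entirely in emulated single precision.

```python
import math
import struct

def f32(x):
    """Round a double to the nearest single-precision value (emulated float32)."""
    return struct.unpack("f", struct.pack("f", x))[0]

def ident(x):
    """No rounding: keep full double precision."""
    return x

def dot(u, v, rnd):
    s = 0.0
    for a, b in zip(u, v):
        s = rnd(s + rnd(a * b))
    return s

def matvec(A, x, rnd):
    return [dot(row, x, rnd) for row in A]

def axpy(alpha, x, y, rnd):
    """Return y + alpha * x with every operation rounded by rnd."""
    return [rnd(yi + rnd(alpha * xi)) for xi, yi in zip(x, y)]

def gmres_cycle(A, r, m, rnd):
    """One GMRES(m) cycle with MGS for A z = r, starting from a zero guess."""
    beta = rnd(math.sqrt(dot(r, r, rnd)))
    V = [[rnd(ri / beta) for ri in r]]      # orthonormal Krylov basis
    H, cs, sn, g = [], [], [], [beta]       # H holds columns of the triangular factor
    for j in range(m):
        w = matvec(A, V[j], rnd)
        h = []
        for v in V:                         # modified Gram-Schmidt
            hij = dot(w, v, rnd)
            w = axpy(-hij, v, w, rnd)
            h.append(hij)
        hnext = rnd(math.sqrt(dot(w, w, rnd)))
        for i in range(j):                  # apply earlier Givens rotations
            t = rnd(rnd(cs[i] * h[i]) + rnd(sn[i] * h[i + 1]))
            h[i + 1] = rnd(rnd(cs[i] * h[i + 1]) - rnd(sn[i] * h[i]))
            h[i] = t
        d = rnd(math.hypot(h[j], hnext))
        if d == 0.0:                        # singular breakdown: stop the cycle
            break
        c, s = rnd(h[j] / d), rnd(hnext / d)
        cs.append(c)
        sn.append(s)
        h[j] = d                            # new rotation zeroes out hnext
        H.append(h)
        g.append(rnd(-s * g[j]))
        g[j] = rnd(c * g[j])
        if hnext == 0.0:                    # Krylov space exhausted
            break
        V.append([rnd(wk / hnext) for wk in w])
    k = len(H)
    y = [0.0] * k
    for i in reversed(range(k)):            # back substitution on the triangular system
        s = g[i]
        for j in range(i + 1, k):
            s = rnd(s - rnd(H[j][i] * y[j]))
        y[i] = rnd(s / H[i][i])
    z = [0.0] * len(r)
    for j in range(k):
        z = axpy(y[j], V[j], z, rnd)
    return z

def mixed_gmres(A, b, m=5, tol=1e-12, max_restarts=50):
    """Restarted GMRES: double residual and update, single-precision inner cycle."""
    A32 = [[f32(a) for a in row] for row in A]   # single-precision copy of A
    x = [0.0] * len(b)
    bnorm = math.sqrt(dot(b, b, ident))
    for restarts in range(max_restarts + 1):
        r = [bi - axi for bi, axi in zip(b, matvec(A, x, ident))]  # double
        rel = math.sqrt(dot(r, r, ident)) / bnorm
        if rel <= tol or restarts == max_restarts:
            return x, restarts, rel
        z = gmres_cycle(A32, [f32(ri) for ri in r], m, f32)        # single
        x = [xi + zi for xi, zi in zip(x, z)]                      # double

# Small non-symmetric, diagonally dominant test problem with solution (1, ..., 1).
n = 20
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 4.0
    if i + 1 < n:
        A[i][i + 1] = -1.0
        A[i + 1][i] = -0.5
b = matvec(A, [1.0] * n, ident)
x, restarts, rel = mixed_gmres(A, b, m=5, tol=1e-12, max_restarts=200)
print("restarts:", restarts, "relative residual:", rel)
```

Passing `ident` instead of `f32` to `gmres_cycle` turns the same routine into an all-double baseline, which loosely mirrors the generic-programming idea of writing the solver once and instantiating it for different data types.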
