Relations between Galerkin and Norm-Minimizing Iterative Methods for Solving Linear Systems

Several iterative methods for solving linear systems $Ax=b$ first construct a basis for a Krylov subspace and then use the basis vectors, together with the Hessenberg (or tridiagonal) matrix generated during that construction, to obtain an approximate solution to the linear system. To determine the approximate solution, one solves either a linear system with the Hessenberg matrix as coefficient matrix or an extended Hessenberg least squares problem. In the first case, referred to as a Galerkin method, the residual is orthogonal to the Krylov subspace, whereas in the second case, referred to as a norm-minimizing method, the residual (or a related quantity) is minimized over the Krylov subspace. Examples of such pairs include the full orthogonalization method (FOM, or Arnoldi) and the generalized minimal residual (GMRES) algorithm; the biconjugate gradient (BCG) and quasi-minimal residual (QMR) algorithms; and their symmetric equivalents, the Lanczos and minimal residual (MINRES) algorithms. A relationship between the solution of the linear system and that of the least squares problem is used to relate the residual norms in Galerkin processes to the norms of the quantities minimized in the corresponding norm-minimizing processes. It is shown that when the norm-minimizing process is converging rapidly, the residual norms in the corresponding Galerkin process exhibit similar behavior, whereas when the norm-minimizing process is converging very slowly, the residual norms in the corresponding Galerkin process are significantly larger. This generalizes the relationship between Arnoldi and GMRES residual norms established in [P. N. Brown, A theoretical comparison of the Arnoldi and GMRES algorithms, SIAM J. Sci. Statist. Comput., 12 (1991), pp. 58--78]. For MINRES and Lanczos, and for two nonsymmetric bidiagonalization procedures, we extend the arguments to incorporate the effects of finite precision arithmetic.
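
To make the Galerkin versus norm-minimizing distinction concrete for the FOM/GMRES pair, the following is a minimal numerical sketch and not the paper's own code. It assumes NumPy, a zero initial guess, and exact-arithmetic reasoning. The function names `arnoldi` and `residual_norms` and the test matrix are hypothetical choices made for illustration. At step $k$ the sketch solves the square Hessenberg system (Galerkin) and the extended Hessenberg least squares problem (norm-minimizing), then compares the Galerkin residual norm with the commonly stated form of the relationship, $\|r_k^{F}\| = \|r_k^{G}\| / \sqrt{1 - (\|r_k^{G}\| / \|r_{k-1}^{G}\|)^2}$, where $r_k^{F}$ and $r_k^{G}$ denote the FOM and GMRES residuals and $r_0^{G} = b$.

```python
import numpy as np

def arnoldi(A, b, m):
    """Run m Arnoldi steps (modified Gram-Schmidt).

    Returns V (n x (m+1)), the extended Hessenberg matrix H ((m+1) x m),
    and beta = ||b||. No breakdown check: assumes H[j+1, j] != 0.
    """
    n = b.size
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    beta = np.linalg.norm(b)
    V[:, 0] = b / beta
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    return V, H, beta

def residual_norms(A, b, m):
    """True residual norms of the Galerkin (FOM) and norm-minimizing
    (GMRES) iterates for steps k = 1..m, with zero initial guess."""
    V, H, beta = arnoldi(A, b, m)
    fom, gmres = [], []
    for k in range(1, m + 1):
        rhs = np.zeros(k + 1)
        rhs[0] = beta
        # Galerkin: square k x k Hessenberg system H_k y = beta * e_1.
        y_f = np.linalg.solve(H[:k, :k], rhs[:k])
        # Norm-minimizing: (k+1) x k extended Hessenberg least squares.
        y_g = np.linalg.lstsq(H[:k + 1, :k], rhs, rcond=None)[0]
        fom.append(np.linalg.norm(b - A @ (V[:, :k] @ y_f)))
        gmres.append(np.linalg.norm(b - A @ (V[:, :k] @ y_g)))
    return np.array(fom), np.array(gmres)

# Hypothetical test problem: a shifted random matrix, chosen so that
# every H_k is nonsingular and the FOM iterates are defined.
rng = np.random.default_rng(0)
n, m = 20, 8
A = 4.0 * np.eye(n) + rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)

fom, gmres = residual_norms(A, b, m)
# GMRES residual norm at the previous step, starting from ||r_0|| = ||b||.
prev = np.concatenate(([np.linalg.norm(b)], gmres[:-1]))
predicted = gmres / np.sqrt(1.0 - (gmres / prev) ** 2)

for k in range(m):
    print(f"k={k + 1}  GMRES={gmres[k]:.3e}  FOM={fom[k]:.3e}  "
          f"predicted FOM={predicted[k]:.3e}")
```

The formula reflects the qualitative statement above: when the GMRES residual norm drops sharply from one step to the next, the ratio under the square root is small and the two residual norms are close; when GMRES nearly stagnates, the ratio approaches one and the Galerkin residual norm becomes much larger. In floating point the agreement is only approximate, and it degrades as the ratio approaches one.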