A Recursive Blocked Schur Algorithm for Computing the Matrix Square Root

The Schur method for computing a matrix square root reduces the matrix to the Schur triangular form and then computes a square root of the triangular matrix. We show that by using a recursive blocking technique the computation of the square root of the triangular matrix can be made rich in matrix multiplication. Numerical experiments making appropriate use of level 3 BLAS show significant speedups over the point algorithm, both in the square root phase and in the algorithm as a whole. The excellent numerical stability of the point algorithm is shown to be preserved by recursive blocking. These results are extended to the real Schur method. Recursive blocking is also shown to be effective for multiplying triangular matrices.