High-performance up-and-downdating via Householder-like transformations

We present high-performance algorithms for up-and-downdating a Cholesky factor or QR factorization. The method uses Householder-like transformations, sometimes called hyperbolic Householder transformations, which are accumulated so that most of the computation can be cast in terms of high-performance matrix-matrix operations. The resulting algorithms can then serve as building blocks for an algorithm-by-blocks, which allows the computation to be conveniently scheduled on multithreaded architectures such as multicore processors. Performance is shown to be similar to that achieved by a blocked QR factorization via Householder transformations.
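To make the term "up-and-downdating" concrete, the sketch below modifies a Cholesky factor by a single vector. It is not the accumulated Householder-like algorithm described above. It is the textbook unblocked rank-one scheme, shown only to illustrate the problem being solved. The names `rank_one_modify` and `gram` are hypothetical, and matrices are assumed to be small dense lists of lists.

```python
import math


def rank_one_modify(L, x, sign=+1):
    """Return the lower-triangular Cholesky factor of L L^T + sign * x x^T.

    sign=+1 is an update, sign=-1 is a downdate. Illustrative unblocked
    scheme (an assumption of this sketch), not the blocked algorithm.
    """
    n = len(L)
    L = [row[:] for row in L]  # work on copies so the inputs are untouched
    x = list(x)
    for k in range(n):
        d2 = L[k][k] ** 2 + sign * x[k] ** 2
        if d2 <= 0.0:
            # A downdate can destroy positive definiteness.
            raise ValueError("modified matrix is not positive definite")
        r = math.sqrt(d2)
        c = r / L[k][k]
        s = x[k] / L[k][k]
        L[k][k] = r
        for i in range(k + 1, n):
            # Combine column k of L with x, then carry the remainder of x on.
            L[i][k] = (L[i][k] + sign * s * x[i]) / c
            x[i] = c * x[i] - s * L[i][k]
    return L


def gram(L):
    """Form L L^T, used here only to check results."""
    n = len(L)
    return [[sum(L[i][k] * L[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

Updating with a vector and then downdating with the same vector should recover the original factor up to rounding, which gives a simple sanity check. This scheme processes one column at a time using vector operations only; the algorithms summarized above instead accumulate transformations so that most of the work is done by matrix-matrix operations.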
