Implementing QR factorization updating algorithms on GPUs

Linear least squares problems are commonly solved by QR factorization. When multiple solutions need to be computed with only minor changes in the underlying data, knowledge of the difference between the old data set and the new can be used to update an existing factorization at reduced computational cost. We investigate the viability of implementing QR updating algorithms on GPUs and demonstrate that GPU-based updating for removing columns achieves speed-ups of up to 13.5x compared with full GPU QR factorization. We characterize the conditions under which other types of updates also achieve speed-ups.
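To make the idea of updating concrete, the following is a minimal serial sketch of one textbook way to update the triangular factor R when a column is removed: delete the column, then restore triangular form with Givens rotations. It is not the GPU implementation evaluated here; the function name `delete_column_update` and the list-of-lists matrix layout are assumptions made for illustration, and the sketch updates R only, leaving out Q and the right-hand side.

```python
import math


def delete_column_update(R, k):
    """Return the updated triangular factor after removing column k of R.

    R is an upper triangular matrix stored as a list of rows (an
    illustrative layout, not the one used by any GPU library).
    """
    n_rows = len(R)
    # Dropping column k leaves an upper Hessenberg matrix: from column k
    # onward, each column has one nonzero entry below the diagonal.
    H = [row[:k] + row[k + 1:] for row in R]
    n_cols = len(H[0]) if H else 0

    for j in range(k, min(n_cols, n_rows - 1)):
        a, b = H[j][j], H[j + 1][j]
        r = math.hypot(a, b)
        if r == 0.0:
            continue  # nothing to eliminate in this column
        c, s = a / r, b / r
        # Rotate rows j and j+1 so that the subdiagonal entry becomes zero.
        for col in range(j, n_cols):
            upper, lower = H[j][col], H[j + 1][col]
            H[j][col] = c * upper + s * lower
            H[j + 1][col] = -s * upper + c * lower
    return H


# Example: remove column 1 from a 4 x 4 triangular factor.
R = [
    [2.0, 1.0, 3.0, 4.0],
    [0.0, 1.5, -1.0, 2.0],
    [0.0, 0.0, 2.5, 0.5],
    [0.0, 0.0, 0.0, 1.0],
]
R_updated = delete_column_update(R, 1)
```

Only the rows at or below the deleted column's index are touched, so the work is proportional to the trailing part of R rather than to a full refactorization. This is why removing a column near the right-hand edge is cheaper than removing one near the left.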
