Matrix inversion on CPU–GPU platforms with applications in control theory

In this paper, we tackle the inversion of large‐scale dense matrices via conventional matrix factorizations (LU, Cholesky, and LDL^T) and the Gauss–Jordan method on hybrid platforms consisting of a multicore CPU and a many‐core graphics processor (GPU). Specifically, we introduce the different matrix inversion algorithms in a unified framework based on the notation of the FLAME project; we develop hybrid implementations of the matrix operations underlying these algorithms, as alternatives to those in existing libraries for single‐GPU systems; and we perform an extensive experimental study, on a platform equipped with state‐of‐the‐art general‐purpose architectures from Intel (Santa Clara, CA, USA) and a ‘Fermi’ GPU from NVIDIA (Santa Clara, CA, USA), that exposes the efficiency of the different inversion approaches. Our study and experimental results show the simplicity and performance advantage of the inversion methods based on Gauss–Jordan elimination, as well as the difficulties associated with the symmetric indefinite case. Copyright © 2012 John Wiley & Sons, Ltd.
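
As a rough illustration of the Gauss–Jordan approach mentioned above, the following is a minimal, unblocked sketch in plain Python. It is not the blocked hybrid CPU–GPU implementation studied in the paper: the function name `gauss_jordan_inverse`, the list-of-rows representation, and the use of partial pivoting on an augmented matrix are assumptions made purely for illustration.

```python
def gauss_jordan_inverse(a):
    """Invert a square matrix (given as a list of rows) by Gauss-Jordan
    elimination with partial pivoting. Illustrative, unblocked sketch only."""
    n = len(a)
    # Build the augmented matrix [A | I]; its right half becomes the inverse.
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(a)
    ]
    for k in range(n):
        # Partial pivoting: choose the row with the largest entry in column k.
        p = max(range(k, n), key=lambda i: abs(aug[i][k]))
        if aug[p][k] == 0.0:
            raise ValueError("matrix is singular")
        aug[k], aug[p] = aug[p], aug[k]
        # Scale the pivot row so that the pivot element becomes 1.
        pivot = aug[k][k]
        aug[k] = [v / pivot for v in aug[k]]
        # Eliminate column k from every other row, above and below the pivot.
        for i in range(n):
            if i != k:
                factor = aug[i][k]
                if factor != 0.0:
                    aug[i] = [vi - factor * vk for vi, vk in zip(aug[i], aug[k])]
    # The left half is now the identity; return the right half.
    return [row[n:] for row in aug]
```

Unlike the factorization-based routes (LU, Cholesky, LDL^T), this procedure sweeps over the matrix once and eliminates entries both above and below the pivot, so no separate triangular inversion or back-substitution stage is needed. That structural simplicity is what the abstract refers to.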
