IMPLEMENTING AN INTERIOR POINT METHOD FOR LINEAR PROGRAMS ON A CPU-GPU SYSTEM

Graphics processing units (GPUs), present in every laptop and desktop computer, are potentially powerful computational engines for solving numerical problems. We present a mixed precision CPU-GPU algorithm for solving linear programming problems using interior point methods. This algorithm, based on the rectangular-packed matrix storage scheme of Gunnels and Gustavson, uses the GPU for computationally intensive tasks such as matrix assembly, Cholesky factorization, and forward and back substitution. Comparisons with a CPU implementation demonstrate that we can improve performance by using the GPU for sufficiently large problems. Since GPU architectures and programming languages are rapidly evolving, we expect that GPUs will be an increasingly attractive tool for matrix computation in the future.
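
The three GPU tasks named above (matrix assembly, Cholesky factorization, and forward and back substitution) can be illustrated with a small CPU-only sketch. It assumes, as is common for interior point methods, that the assembled matrix is the normal-equations matrix M = A D A^T with D a positive diagonal scaling. It uses plain dense storage in double precision and does not reproduce the rectangular-packed format, the mixed precision scheme, or any GPU code from the paper. All function names are hypothetical and chosen only for illustration.

```python
import math

def assemble_normal_matrix(A, d):
    """Assemble M = A * diag(d) * A^T for a dense m-by-n matrix A (list of rows)."""
    m, n = len(A), len(d)
    M = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):  # M is symmetric, so compute the lower triangle once
            s = sum(A[i][k] * d[k] * A[j][k] for k in range(n))
            M[i][j] = s
            M[j][i] = s
    return M

def cholesky(M):
    """Return lower-triangular L with L * L^T = M (M symmetric positive definite)."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        diag = M[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if diag <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(diag)
        for i in range(j + 1, n):
            L[i][j] = (M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L

def forward_substitution(L, b):
    """Solve L y = b for lower-triangular L."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    return y

def back_substitution(L, y):
    """Solve L^T x = y using the same lower-triangular factor L."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def solve_normal_equations(A, d, r):
    """Solve (A diag(d) A^T) x = r by assembly, factorization, and two triangular solves."""
    L = cholesky(assemble_normal_matrix(A, d))
    return back_substitution(L, forward_substitution(L, r))
```

For example, `solve_normal_equations([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]], [1.0, 2.0, 3.0], [1.0, 2.0])` assembles the 2-by-2 matrix [[4, 3], [3, 5]] and solves it against the right-hand side. In an interior point iteration the scaling `d` changes at every step, so all three stages are repeated each time, which is why they are natural candidates for offloading.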
