The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations

SOLAR is a portable high-performance library for out-of-core dense matrix computations. It combines portability with high performance by using existing high-performance in-core subroutine libraries and by using an optimized matrix input-output library. SOLAR works on parallel computers, workstations, and personal computers. It supports in-core computations on both shared-memory and distributed-memory machines, and its matrix input-output library supports both conventional I/O interfaces and parallel I/O interfaces. This paper discusses the overall design of SOLAR, its interfaces, and the design of several important subroutines. Experimental results show that SOLAR can factor on a single workstation an out-of-core positive-definite symmetric matrix at a rate exceeding 215 Mflops, and an out-of-core general matrix at a rate exceeding 195 Mflops. Less than 16% of the running time is spent on I/O in these computations. These results indicate that SOLAR's portability does not compromise its performance. We expect that the combination of portability, modularity, and the use of a high-level I/O interface will make the library an important platform for research on out-of-core algorithms and on parallel I/O.
