Disk resident arrays: an array-oriented I/O library for out-of-core computations

In out-of-core computations, disk storage is treated as another level in the memory hierarchy, below cache, local memory, and (in a parallel computer) remote memories. However, the tools used to manage this storage are typically quite different from those used to manage access to local and remote memory. This disparity complicates the implementation of out-of-core algorithms and hinders portability. We describe a programming model that addresses this problem. The model allows parallel programs to use essentially the same mechanisms to manage the movement of data between any two adjacent levels in a hierarchical memory system. We take as our starting point the Global Arrays shared-memory model and library, which support a variety of operations on distributed arrays, including transfer between local and remote memories. We show how this model can be extended to support explicit transfer between global memory and secondary storage, and we define a Disk Resident Arrays library that supports such transfers. We illustrate the utility of the resulting model with two applications: an out-of-core matrix multiplication and a large computational chemistry program. We also describe implementation techniques on several parallel computers and present experimental results demonstrating that the Disk Resident Arrays model can be implemented efficiently on these machines.
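The idea of explicit, array-oriented transfers between memory and disk can be illustrated with a minimal single-process sketch of a blocked out-of-core matrix multiplication. Everything named here (`DiskArray`, `read_section`, `write_section`, `ooc_matmul`, `demo`) is hypothetical and chosen for illustration; it is not the Disk Resident Arrays API, and it omits the parallelism, global-memory level, and asynchronous I/O that the library targets. It only shows the access pattern: rectangular sections are copied from disk into memory, combined there, and the result section is written back.

```python
import os
import struct
import tempfile


class DiskArray:
    """Hypothetical stand-in for a disk resident array: a row-major
    matrix of float64 values stored in a file (not the real DRA API)."""

    ITEM = 8  # bytes per float64

    def __init__(self, path, rows, cols):
        self.path, self.rows, self.cols = path, rows, cols
        # Pre-size the file with zeros so any section can be written later.
        with open(path, "wb") as f:
            f.write(b"\0" * (rows * cols * self.ITEM))

    def write_section(self, r0, c0, block):
        """Copy an in-memory block (list of rows) to disk at (r0, c0)."""
        with open(self.path, "r+b") as f:
            for i, row in enumerate(block):
                f.seek(((r0 + i) * self.cols + c0) * self.ITEM)
                f.write(struct.pack(f"<{len(row)}d", *row))

    def read_section(self, r0, r1, c0, c1):
        """Copy rows r0..r1-1, columns c0..c1-1 from disk into memory."""
        n = c1 - c0
        block = []
        with open(self.path, "rb") as f:
            for r in range(r0, r1):
                f.seek((r * self.cols + c0) * self.ITEM)
                block.append(list(struct.unpack(f"<{n}d", f.read(n * self.ITEM))))
        return block


def ooc_matmul(a, b, c, blk):
    """Compute c = a @ b with at most three blk-by-blk blocks in memory."""
    for i0 in range(0, a.rows, blk):
        i1 = min(i0 + blk, a.rows)
        for j0 in range(0, b.cols, blk):
            j1 = min(j0 + blk, b.cols)
            # Accumulator for one block of the result, kept in memory.
            acc = [[0.0] * (j1 - j0) for _ in range(i1 - i0)]
            for k0 in range(0, a.cols, blk):
                k1 = min(k0 + blk, a.cols)
                a_blk = a.read_section(i0, i1, k0, k1)  # disk -> memory
                b_blk = b.read_section(k0, k1, j0, j1)  # disk -> memory
                for i, a_row in enumerate(a_blk):
                    for k, a_ik in enumerate(a_row):
                        b_row = b_blk[k]
                        for j in range(j1 - j0):
                            acc[i][j] += a_ik * b_row[j]
            c.write_section(i0, j0, acc)  # memory -> disk


def demo():
    """Multiply a 3x4 matrix by a 4x2 matrix using 2x2 blocks."""
    with tempfile.TemporaryDirectory() as d:
        a = DiskArray(os.path.join(d, "a.bin"), 3, 4)
        b = DiskArray(os.path.join(d, "b.bin"), 4, 2)
        c = DiskArray(os.path.join(d, "c.bin"), 3, 2)
        a.write_section(0, 0, [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
        b.write_section(0, 0, [[1, 0], [0, 1], [1, 1], [2, 0]])
        ooc_matmul(a, b, c, blk=2)
        return c.read_section(0, 3, 0, 2)
```

Calling `demo()` returns the product as a list of rows. The block size `blk` bounds how much of each matrix is resident in memory at once, which is the knob an out-of-core algorithm would tune against available memory and I/O cost.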
