PASSION: Parallel And Scalable Software for Input-Output

We are developing PASSION (Parallel And Scalable Software for Input-Output), a software system that provides support for high-performance parallel I/O at the language, compiler, runtime, and file system levels. PASSION provides runtime procedures for parallel access to files (read/write), as well as for out-of-core computations. These routines can either be used together with a compiler to translate out-of-core data-parallel programs written in a language like HPF, or used directly by application programmers. A number of optimizations, such as Two-Phase Access, Data Sieving, Data Prefetching, and Data Reuse, have been incorporated in the PASSION Runtime Library for improved performance. PASSION also provides an initial framework for runtime support for out-of-core irregular problems. The goal of the PASSION compiler is to automatically translate out-of-core data-parallel programs to node programs for distributed memory machines, with calls to the PASSION Runtime Library. At the language level, PASSION suggests extensions to HPF for out-of-core programs. At the file system level, PASSION provides support for buffering and prefetching data from disks. A portable parallel file system, which can be used across homogeneous or heterogeneous networks of workstations, is also being developed as part of this project. PASSION also provides support for integrating data and task parallelism using parallel I/O techniques. We have used PASSION to implement a number of out-of-core applications, such as a Laplace's equation solver, 2D FFT, matrix multiplication, LU decomposition, and image processing applications, as well as unstructured mesh kernels in molecular dynamics and computational fluid dynamics. We are currently using PASSION in applications in CFD (3D turbulent flows), molecular structure calculations, seismic computations, and earth and space science applications such as Four-Dimensional Data Assimilation.
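
The abstract names Data Sieving without describing it. As the term is commonly understood, the idea is to replace many small non-contiguous reads with one large contiguous read and then pick out the wanted pieces in memory. The following is a minimal sketch of that idea only; the function name `sieve_read` and the `(offset, length)` request format are hypothetical and are not the PASSION Runtime Library interface.

```python
import io

def sieve_read(f, requests):
    """Serve several (offset, length) requests with a single contiguous read.

    Hypothetical helper for illustration; not part of PASSION.
    """
    if not requests:
        return []
    # Smallest contiguous extent that covers every requested piece.
    start = min(offset for offset, _ in requests)
    end = max(offset + length for offset, length in requests)
    f.seek(start)
    buf = f.read(end - start)  # one large I/O request instead of many small ones
    # Extract each requested piece from the in-memory buffer.
    return [buf[offset - start: offset - start + length]
            for offset, length in requests]

# Usage on an in-memory "file" holding the byte values 0..99.
f = io.BytesIO(bytes(range(100)))
pieces = sieve_read(f, [(10, 3), (50, 2)])
```

The trade-off in this sketch is that unwanted bytes between the requested pieces are also read, so the approach pays off when the cost of issuing many small requests outweighs the cost of transferring the extra data.
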
PASSION is currently available on the Intel Paragon, Touchstone Delta, and iPSC/860. Efforts are underway to port it to the IBM SP-1 and SP-2 using the Vesta Parallel File System.

This work was supported in part by NSF Young Investigator Award CCR-9357840, grants from Intel SSD and IBM Corp., in part by USRA CESDIS Contract # 5555-26, and also in part by ARPA under contract # DABT63-91-C-0028. The content of the information does not necessarily reflect the position or the policy of the US Government, and no official endorsement should be inferred. Rajeev Thakur is supported by a Syracuse University Graduate Fellowship. Michael Harry is supported by an ARPA Assert Fellowship. Rakesh Krishnaiyer is supported by the CASE Center, a NY State Advanced Technology Center. This work was performed in part using the Intel Touchstone Delta and Paragon Systems operated by Caltech on behalf of the Concurrent Supercomputing Consortium. Access to this facility was provided by CRPC. This work was also performed in part using the Intel iPSC/860 and IBM SP-1/SP-2 at NPAC, the IBM SP-1 at Argonne National Laboratory, and the Intel Paragon at the Jet Propulsion Laboratory.
