Distributed Implementation of OpenMP Based on Checkpointing Aided Parallel Execution

Checkpointers are normally used to secure the execution of sequential and parallel programs. This article shows that they can also be used to automatically generate a parallel program from a sequential one, and that the resulting program can be executed on any kind of distributed parallel system. The article also presents how this new technique can be integrated into the usual compilation chain to provide a distributed implementation of OpenMP. Finally, some performance measurements are discussed.
