Transforming the Adaptive Irregular Out-of-Core Applications for Hiding Communication and Disk I/O

In adaptive irregular out-of-core applications, communication and bulk disk I/O account for a large portion of the overall execution time. This paper presents a program transformation scheme that enables communication, computation, and disk I/O to overlap in this kind of application. We take programs written in the inspector-executor model as a starting point and transform them into a pipelined form. Decomposing the inspector phase and reordering iterations exposes more opportunities for overlap, which the transformed programs exploit. In the experiments, our techniques are applied to two important applications: a partial differential equation solver and a molecular dynamics problem. For these applications, the versions employing our techniques are almost 30% faster than the inspector-executor versions.
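As a rough illustration of the general idea, and not the paper's actual transformation, the following minimal Python sketch contrasts a sequential inspector-executor loop with a pipelined one that prefetches the data for the next tile of iterations while the current tile is being computed. All names here (`inspect`, `fetch`, `execute`, `run_sequential`, `run_pipelined`) are hypothetical, the "store" is an in-memory list standing in for remote or on-disk data, and the paper's inspector decomposition and iteration reordering are not modeled.

```python
from concurrent.futures import ThreadPoolExecutor


def inspect(index_tile):
    """Inspector: determine which distinct elements this tile accesses."""
    return sorted(set(index_tile))


def fetch(store, schedule):
    """Stand-in for communication or disk reads: gather scheduled elements."""
    return {i: store[i] for i in schedule}


def execute(index_tile, fetched):
    """Executor: irregular reduction over the indirectly accessed elements."""
    return sum(fetched[i] for i in index_tile)


def run_sequential(store, index_tiles):
    """Baseline: inspect, fetch, and compute each tile strictly in order."""
    total = 0
    for tile in index_tiles:
        total += execute(tile, fetch(store, inspect(tile)))
    return total


def run_pipelined(store, index_tiles):
    """Pipelined: fetch tile k+1 in the background while computing tile k."""
    if not index_tiles:
        return 0

    def prefetch(tile):
        return fetch(store, inspect(tile))

    total = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(prefetch, index_tiles[0])
        for k, tile in enumerate(index_tiles):
            fetched = pending.result()  # wait for this tile's data
            if k + 1 < len(index_tiles):
                # Start the next tile's inspection and fetch before computing.
                pending = pool.submit(prefetch, index_tiles[k + 1])
            total += execute(tile, fetched)
    return total


store = [10, 20, 30, 40, 50]            # stands in for out-of-core data
tiles = [[0, 2, 2], [4, 1], [3, 3, 0]]  # indirection array split into tiles
assert run_sequential(store, tiles) == run_pipelined(store, tiles)
```

Both versions compute the same result; only the ordering of fetch and compute differs. In this toy setting the background thread gains nothing because `fetch` is an in-memory gather, but with blocking disk reads or network transfers in its place, the wait for tile k+1 would be hidden behind the computation on tile k.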
