Formal Semantics Applied to the Implementation of a Skeleton-Based Parallel Programming Library

In a previous paper [1], we described QUAFF, a skeleton-based parallel programming library whose main originality is its reliance on C++ template meta-programming [2,3] techniques to significantly reduce the overhead traditionally associated with object-oriented implementations of such libraries. The basic idea is to use the C++ template mechanism so that skeleton-based programs are actually run at compile-time and generate C+MPI code, which is then compiled and executed at run-time. The implementation mechanism supporting this compile-time approach to skeleton-based parallel programming was only sketched, mainly because the operational semantics of the skeletons were not stated in a formal way but “hardwired” in a set of complex meta-programs. As a result, changing these semantics or adding a new skeleton was difficult.

In this paper, we give a formal model for the QUAFF skeleton system, describe how this model can be implemented efficiently using C++ meta-programming techniques, and show how this helps overcome the aforementioned difficulties. The new implementation relies on three formally defined stages. First, the C++ compiler generates an abstract syntax tree representing the parallel structure of the application from the high-level C++ skeletal program source. Then, this tree is turned into an abstract process network by means of a set of production rules; this process network encodes, in a platform-independent way, the communication topology and, for each node, the scheduling of communications and computations. Finally, the process network is translated into C+MPI code. In contrast to the previous QUAFF implementation, the process network now plays the role of an explicit intermediate representation. Adding a new skeleton now only requires giving the set of production rules for expanding the corresponding tree node into a process sub-network.

The paper is organized as follows. Section 2 briefly recalls the main features of the QUAFF programming model.
Section 3 presents the formal model we defined to turn a skeleton abstract syntax tree into a process network. Section 4 shows how template meta-programming is used to implement this model. We conclude with experimental results for this new implementation (Section 5) and a brief review of related work (Section 6).
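
To make the compile-time approach more concrete, the following C++ fragment sketches how a skeleton tree can be encoded as a type and how a rule attached to each kind of node can be evaluated by the compiler. It is only an illustration under stated assumptions: the names Seq, Pipe, Farm and ProcessCount are hypothetical and are not QUAFF's actual interface, and the rule shown merely counts the processes of the resulting network (assuming one dispatcher process per farm) instead of building the network itself.

```cpp
#include <cstddef>

// Hypothetical skeleton tags (illustrative names, not QUAFF's API).
// A skeleton program is a type; the compiler sees it as a tree.
template <void (*F)()> struct Seq {};                   // wraps a sequential user function
template <class... Stages> struct Pipe {};              // pipeline of sub-skeletons
template <std::size_t N, class Worker> struct Farm {};  // N replicated workers

// Stand-in for a production rule: one specialization per tree node kind.
// Here each rule only computes how many processes the node expands into.
template <class Skeleton> struct ProcessCount;

// A sequential function runs on a single process.
template <void (*F)()>
struct ProcessCount<Seq<F>> {
    static constexpr std::size_t value = 1;
};

// An empty pipeline contributes no process (end of the recursion).
template <>
struct ProcessCount<Pipe<>> {
    static constexpr std::size_t value = 0;
};

// A pipeline is the concatenation of the sub-networks of its stages.
template <class Head, class... Tail>
struct ProcessCount<Pipe<Head, Tail...>> {
    static constexpr std::size_t value =
        ProcessCount<Head>::value + ProcessCount<Pipe<Tail...>>::value;
};

// Assumption: a farm uses one dispatcher process plus N copies of its worker.
template <std::size_t N, class Worker>
struct ProcessCount<Farm<N, Worker>> {
    static constexpr std::size_t value = 1 + N * ProcessCount<Worker>::value;
};

// Placeholder user functions for the example.
inline void grab() {}
inline void filter() {}
inline void display() {}

// The skeleton tree of a small application, written as a type.
using App = Pipe<Seq<grab>, Farm<4, Seq<filter>>, Seq<display>>;

// Evaluated entirely at compile time: 1 + (1 + 4 * 1) + 1 processes.
static_assert(ProcessCount<App>::value == 7, "unexpected process count");
```

In this sketch, supporting a new skeleton amounts to adding one tag type and one specialization of the rule, which mirrors, in a much simplified form, the extensibility argument made above.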