Towards a new tuple-based programming paradigm for expressing and optimizing irregular parallel computations

Irregular computations are inherently hard to optimize and parallelize automatically. This paper describes a new tuple-based programming paradigm for expressing such computations. At its core, the paradigm allows irregular computations to be specified at the level of elementary data entries (tuples) rather than on complicated data structures. As a consequence, the actual data structures are constructed during the code generation phase. With this framework, existing implementations of irregular computations, for instance in the C programming language, can be mapped automatically into the tuple-based programming model, and the code generated from the resulting specification is competitive with hand-optimized code. The potential of this approach is demonstrated on two representative applications: sparse triangular solve, representing sparse linear algebra, and an implementation of the Bellman-Ford algorithm, representing graph algorithms. We demonstrate that, from an ordinary triangular solve code, parallelized implementations can be generated automatically that until now could only be derived by hand. We show that the performance of these automatically generated implementations is comparable to that of hand-optimized triangular solvers. For the Bellman-Ford algorithm, initial experiments show that the derived GPU implementations achieve speedups in execution time of two to four orders of magnitude over the initial implementation.
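
To make the idea of specifying a computation on tuples rather than on a data structure more concrete, the following is a minimal sketch of Bellman-Ford written directly over an unordered collection of edge tuples. It is an illustration only and not the notation or the generated code of the paper: the function name `bellman_ford`, the `(u, v, weight)` tuple layout, and the use of Python are all assumptions made for this example. The point it tries to show is that the relaxation step never refers to an adjacency list, CSR array, or any other storage layout, so that choice could in principle be deferred to a later code generation phase.

```python
import math


def bellman_ford(num_vertices, edges, source):
    """Single-source shortest paths over a bag of (u, v, weight) tuples.

    Hypothetical sketch: `edges` may be any iterable container (set, list, ...)
    and is visited in whatever order the container yields its tuples.
    """
    dist = [math.inf] * num_vertices
    dist[source] = 0.0

    # At most num_vertices - 1 passes are needed when no negative cycle exists.
    for _ in range(num_vertices - 1):
        changed = False
        for u, v, w in edges:
            # Relaxation is expressed per tuple; no storage layout is assumed.
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break  # distances are stable, so further passes are unnecessary

    # One extra pass: any further improvement implies a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("negative cycle reachable from source")
    return dist


if __name__ == "__main__":
    # The edge tuples are stored in a set, i.e. with no guaranteed order.
    example_edges = {(0, 1, 4.0), (0, 2, 1.0), (2, 1, 2.0), (1, 3, 1.0)}
    print(bellman_ford(4, example_edges, 0))  # [0.0, 3.0, 1.0, 4.0]
```

Because the result does not depend on the order in which the tuples are visited, the same specification could be realized with different orderings, groupings, or storage formats, which is the kind of freedom the tuple-based model leaves to the code generator.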