The PEPPHER Composition Tool: Performance-Aware Dynamic Composition of Applications for GPU-Based Systems

The PEPPHER component model defines an environment for annotating native C/C++ components targeting homogeneous and heterogeneous multicore and manycore systems, including GPU and multi-GPU based systems. For the same computational functionality, captured as a component, different sequential and explicitly parallel implementation variants using various types of execution units may be provided, together with metadata such as explicitly exposed tunable parameters. The goal is to compose an application from its components and variants such that, depending on the run-time context, the most suitable implementation variant is chosen automatically for each invocation. We describe and evaluate the PEPPHER composition tool, which explores the application's components and their implementation variants, generates the necessary low-level code for interacting with the runtime system, and coordinates the native compilation and linking of the various code units into the overall application. Using several applications, we demonstrate how the composition tool provides a high-level programming front-end while effectively utilizing the task-based PEPPHER runtime system (StarPU) underneath.
