Predicting Performance and Energy Efficiency for Large-Scale Parallel Applications on Highly Heterogeneous Platforms

Predicting the performance of parallel programs on large-scale parallel platforms is challenging because of the disparity between the development system and the target platform. The problem has grown now that energy efficiency is a universal concern and platforms are moving towards highly heterogeneous, reconfigurable systems containing GPUs, FPGAs, and other unconventional processing elements. In this paper, we present a simulation-based approach that predicts the energy usage and performance of parallel software on large heterogeneous platforms. It simulates communication activity in detail while abstracting functional behaviour. This allows developers to quickly compare and optimise application designs, hardware configurations, and mapping alternatives, even without a fully working target platform.
