Data-Driven Scenario-Based Application Mapping for Heterogeneous Many-Core Systems

For applications whose workload and execution behavior vary significantly with the input, a single mapping of application tasks to a given target architecture is insufficient. Such a mapping may deliver a high-quality solution for the average case, but it rarely exploits the specific execution behavior of the concurrent tasks triggered by each input tuple. For example, tasks with higher computational demands under certain inputs should be mapped onto the high-performance resources of the heterogeneous architecture. This necessitates mappings that are specialized for specific input data. Yet, due to the large number of input combinations, determining a separate optimized mapping for each individual input workload is not feasible for most applications. As a remedy, we propose to group input data with similar execution characteristics into a small, selected number of so-called workload scenarios, for each of which we supply an optimized mapping. In this paper, we provide a data-driven approach for detecting workload scenarios and exploring scenario-optimized mappings based on a collection of input data. The identification of scenarios and the determination of optimized mappings are interdependent: to identify workload scenarios in a data-driven way, we have to measure execution profiles of the application for the given input data under different application mappings. However, to obtain scenario-optimized application mappings, the workload scenarios have to be known. We tackle this interdependence by proposing a cyclic design methodology that optimizes both aspects iteratively. We show that with our approach, the latency of two exemplary applications, a ray tracing application and an image stitching application, can be significantly improved compared to methods that ignore workload scenarios or do not perform the proposed iterative refinement. Furthermore, we demonstrate that our proposal can be used in the context of a hybrid application mapping methodology.
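
The interdependence between scenario identification and mapping optimization can be illustrated with a small sketch. This is not the methodology evaluated in the paper: the toy latency model, the exhaustive mapping search, the round-robin initialization, and all names (`latency`, `best_mapping`, `cyclic_refinement`) are assumptions made for illustration only. In particular, relabeling each input by its best-performing mapping stands in for the data-driven scenario detection, and exhaustive search stands in for the design space exploration.

```python
from itertools import product

def latency(workload, mapping, speeds):
    """Toy latency model (assumption): tasks sharing a resource run
    sequentially, and the latency is the busiest resource's finish time."""
    busy = [0.0] * len(speeds)
    for task, res in enumerate(mapping):
        busy[res] += workload[task] / speeds[res]
    return max(busy)

def best_mapping(inputs, speeds):
    """Exhaustively pick the mapping with the lowest total latency over
    the given inputs (a stand-in for a real mapping optimizer)."""
    n_tasks = len(inputs[0])
    candidates = product(range(len(speeds)), repeat=n_tasks)
    return min(candidates,
               key=lambda m: sum(latency(w, m, speeds) for w in inputs))

def cyclic_refinement(inputs, speeds, n_scenarios, max_iters=10):
    """Alternate between mapping optimization and scenario identification
    until the scenario labels stop changing."""
    # Hypothetical initialization: round-robin scenario labels.
    labels = [i % n_scenarios for i in range(len(inputs))]
    mappings = []
    for _ in range(max_iters):
        # Step 1: optimize one mapping per scenario, given the current labels.
        mappings = []
        for s in range(n_scenarios):
            members = [w for w, l in zip(inputs, labels) if l == s]
            # An empty scenario falls back to all inputs.
            mappings.append(best_mapping(members or inputs, speeds))
        # Step 2: re-identify scenarios from latencies measured under
        # the current mappings.
        new_labels = [
            min(range(n_scenarios),
                key=lambda s: latency(w, mappings[s], speeds))
            for w in inputs
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, mappings

# Two tasks per input; resource 0 is twice as fast as resource 1.
speeds = [2.0, 1.0]
inputs = [(8, 4), (4, 8), (9, 4), (8, 5), (4, 9), (5, 8)]

labels, mappings = cyclic_refinement(inputs, speeds, n_scenarios=2)
scenario_total = sum(latency(w, mappings[l], speeds)
                     for w, l in zip(inputs, labels))

single = best_mapping(inputs, speeds)
single_total = sum(latency(w, single, speeds) for w in inputs)
```

In this toy setting, the initial round-robin labels mix inputs with different dominant tasks. One pass of re-identification separates them, after which each scenario's mapping places its heavier task on the faster resource, and `scenario_total` ends up below `single_total`. The sketch can stall in a poor grouping for other initial labels, so it should not be read as a statement about the convergence of the proposed approach.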
