Performance-Aware Resource Management of Multi-Threaded Applications on Many-Core Systems

Modern computing systems employ a large number of processing elements, leaving behind traditional design approaches and architectures. On the software side, this evolution in system architecture has also driven rapid changes in application development by increasing the use of highly parallel, multi-threaded, and demanding applications. Many-core systems therefore raise the challenge of efficient resource management, especially when conditions change rapidly at run-time. In this paper, a performance-aware resource management scheme for many-core architectures is presented. In particular, the developed framework takes parallel applications as input and profiles them. Based on that profile information, a thread-to-core mapping algorithm finds (i) the appropriate number of threads for the application in order to maximize the utilization of the system; and (ii) the best mapping for maximizing the performance of the application. Experimental results show that our mapping framework achieves, on average, 23% and 18% better application turnaround time than another state-of-the-art run-time manager.
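
The two decisions named in the abstract, choosing a thread count from profile data and then placing the threads on cores, can be illustrated with a minimal sketch. It is not the paper's algorithm: the profile format (thread count to measured runtime), the 2D-mesh coordinates, the Manhattan-distance cost, and the function names `pick_thread_count` and `map_threads` are all assumptions made for illustration.

```python
# Illustrative sketch only; the heuristics and names below are assumptions,
# not the method described in the paper.

def pick_thread_count(profile, free_core_count):
    """Return the profiled thread count with the lowest runtime that fits.

    profile: dict mapping thread count -> measured runtime in seconds
             (hypothetical profile format).
    """
    feasible = {n: t for n, t in profile.items() if n <= free_core_count}
    if not feasible:
        raise ValueError("no profiled thread count fits the free cores")
    # Lowest runtime wins; ties are broken toward fewer threads.
    return min(feasible, key=lambda n: (feasible[n], n))


def manhattan(a, b):
    """Hop distance between two (row, col) cores on a 2D mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def map_threads(free_cores, n_threads):
    """Greedily pick a compact group of free cores, one core per thread.

    Each free core is tried as a center; the n_threads free cores closest
    to it form a candidate region, and the region with the smallest total
    distance to its center is kept.
    """
    if n_threads > len(free_cores):
        raise ValueError("not enough free cores")
    best_cost, best_region = None, None
    for center in sorted(free_cores):
        region = sorted(free_cores,
                        key=lambda c: (manhattan(center, c), c))[:n_threads]
        cost = sum(manhattan(center, c) for c in region)
        if best_cost is None or cost < best_cost:
            best_cost, best_region = cost, region
    return {thread_id: core for thread_id, core in enumerate(best_region)}


if __name__ == "__main__":
    # Made-up profile: runtime stops improving beyond 4 threads.
    profile = {1: 10.0, 2: 5.6, 4: 3.4, 8: 3.9}
    free = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 3), (3, 3)]
    threads = pick_thread_count(profile, len(free))
    print(threads, map_threads(free, threads))
```

The sketch separates the two choices so that either heuristic can be swapped out; a real run-time manager would also have to account for contention between co-running applications, which is ignored here.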
