Thermal-Aware Task Mapping on Dynamically Reconfigurable Network-on-Chip Based Multiprocessor System-on-Chip

Dark silicon is the phenomenon in which a fraction of a many-core chip must be powered off or run in a low-power state to keep the chip temperature within safe limits. System-level thermal management techniques typically map applications onto non-adjacent cores, but on a conventional network-on-chip (NoC) this degrades communication efficiency among those cores. The recently proposed SMART NoC architecture reduces communication latency by building single-cycle multi-hop bypass channels between distant cores at runtime. However, communication contention diminishes the efficiency of SMART NoC, which in turn degrades system performance. In this paper, we first propose an integer linear programming (ILP) model that addresses the communication problem and generates optimal solutions with consideration of inter-processor communication. We further present TopoMap, a novel heuristic algorithm for task mapping in dark silicon many-core systems built on top of the SMART architecture, which effectively solves the communication contention problem in polynomial time. With fine-grained consideration of chip thermal reliability and inter-processor communication, the presented approaches control the reconfigurability of the NoC communication topology during task mapping and scheduling. Thermal safety is guaranteed by physically decentralizing the active cores, and communication overhead is reduced by minimizing communication contention and maximizing bypass routing. Performance evaluation on PARSEC shows the applicability and effectiveness of the proposed techniques, which achieve on average 42.5 percent and 32.4 percent improvement in communication and application performance, respectively, and a 32.3 percent reduction in system energy consumption, compared with state-of-the-art techniques. TopoMap introduces only a 1.8 percent performance difference compared with the ILP model and scales better to large NoCs.
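The trade-off described above can be illustrated with a minimal sketch. It is not the paper's ILP model or TopoMap. It assumes a square mesh, a checkerboard pattern as one simple way to keep active cores non-adjacent, XY routing, and an idealized contention-free bypass in which up to hpc_max hops along one dimension take a single cycle. All function names and the hpc_max parameter are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import ceil


def checkerboard_cores(n):
    """Cores (x, y) of an n x n mesh chosen so that no two share a mesh link."""
    return [(x, y) for y in range(n) for x in range(n) if (x + y) % 2 == 0]


def xy_links(src, dst):
    """Directed links used by XY routing (X dimension first, then Y)."""
    (x, y), (tx, ty) = src, dst
    links = []
    while x != tx:
        nx = x + (1 if tx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != ty:
        ny = y + (1 if ty > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def bypass_cycles(src, dst, hpc_max):
    """Idealized cycle count: each cycle covers up to hpc_max hops in one
    dimension. Assumption: no contention, no setup or router pipeline delay."""
    dx = abs(src[0] - dst[0])
    dy = abs(src[1] - dst[1])
    return ceil(dx / hpc_max) + ceil(dy / hpc_max)


def contending_pairs(flows):
    """Number of flow pairs that share at least one directed link."""
    link_sets = [set(xy_links(s, d)) for s, d in flows]
    return sum(1 for a, b in combinations(link_sets, 2) if a & b)


if __name__ == "__main__":
    cores = checkerboard_cores(4)
    print(len(cores), "active cores on a 4x4 mesh")
    print("hops:", len(xy_links((0, 0), (3, 3))),
          "bypass cycles:", bypass_cycles((0, 0), (3, 3), hpc_max=4))
    print("contending pairs:",
          contending_pairs([((0, 0), (2, 0)), ((1, 0), (3, 0))]))
```

In this sketch, spreading active cores lengthens paths in hops, the bypass keeps the cycle count low, and flows that overlap on a link are counted as contending. A mapping algorithm of the kind the abstract describes would try to place communicating tasks so that such overlaps are avoided.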
