A lifetime-aware runtime mapping approach for many-core systems in the dark silicon era

In this paper, we propose a novel lifetime reliability-aware resource management approach for many-core architectures. The approach is based on a hierarchical architecture composed of a long-term runtime reliability analysis unit and a short-term runtime mapping unit. The former periodically analyses the aging status of the processing units against a target value specified by the designer and performs recovery actions on highly stressed cores. The latter uses the calculated reliability metrics when mapping newly arrived applications at runtime, maximizing system performance while fulfilling the reliability requirements and respecting the available power budget. Our extensive experimental results show that the proposed reliability-aware approach efficiently selects the processing cores to be used over time, enhancing the reliability at the end of the operational life by up to 62% while offering a performance level comparable to that of the state-of-the-art runtime mapping approach.
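The two-level structure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Core` record, the scalar `reliability` value, the "rest the core" recovery action, the fixed `power_per_core` cost, and the greedy "most reliable first" selection are all assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Core:
    cid: int
    reliability: float      # hypothetical per-core reliability estimate in [0, 1]
    busy: bool = False      # currently running a task
    resting: bool = False   # excluded from mapping so it can recover (assumed action)


def long_term_analysis(cores, target):
    """Long-term unit (sketch): flag cores whose reliability is below the target."""
    for core in cores:
        core.resting = core.reliability < target


def map_application(cores, n_tasks, power_budget, power_per_core):
    """Short-term unit (sketch): choose the most reliable free cores within budget.

    Returns the list of selected core ids, or None if the application
    cannot be mapped right now.
    """
    active = sum(core.busy for core in cores)
    # Number of additional cores that can be powered on within the budget.
    affordable = int((power_budget - active * power_per_core) // power_per_core)
    if affordable < n_tasks:
        return None
    candidates = [c for c in cores if not c.busy and not c.resting]
    if len(candidates) < n_tasks:
        return None
    chosen = sorted(candidates, key=lambda c: c.reliability, reverse=True)[:n_tasks]
    for core in chosen:
        core.busy = True
    return [core.cid for core in chosen]


cores = [Core(0, 0.95), Core(1, 0.80), Core(2, 0.99), Core(3, 0.90)]
long_term_analysis(cores, target=0.85)   # core 1 falls below the target and rests
first = map_application(cores, n_tasks=2, power_budget=6.0, power_per_core=2.0)
second = map_application(cores, n_tasks=2, power_budget=6.0, power_per_core=2.0)
```

In this toy run the first application is placed on the two most reliable available cores, and the second request is rejected because only one more core fits within the assumed power budget. A real mapper would also have to consider aspects the sketch ignores, such as on-chip communication between the tasks of an application.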
