Performance constraint-aware task mapping to optimize lifetime reliability of manycore systems

Negative bias temperature instability (NBTI) has emerged as a critical challenge to the lifetime reliability of computing systems. Traditionally, temperature-aware methodologies have been used to mitigate the impact of NBTI on the aging and degradation of computing systems. However, in the presence of process variation, which is the norm in manycore processors, temperature-aware techniques are inefficient at improving lifetime reliability and can result in poor performance. In this paper, we propose a novel performance constraint-aware task mapping technique that improves lifetime reliability by mitigating NBTI while accounting for on-chip process variation. Our approach consists of two phases: design time and run time. At design time, we generate Pareto-optimal mappings. At run time, our technique judiciously intervenes to perform workload migration that spares the weakest processing core. We compare our approach with performance-greedy and thermal-aware task mapping techniques. Experimental results demonstrate that our approach outperforms both techniques and improves the lifetime reliability of a manycore system by as much as 54% without violating the throughput constraint.
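As a rough illustration of the two-phase idea, the sketch below filters candidate mappings down to a Pareto front and then picks one that satisfies a throughput constraint. It is not the method from the paper: the `Mapping` fields, the `weakest_core_stress` proxy, the two objectives, and the sample numbers are all hypothetical, chosen only to show what "Pareto-optimal mappings under a throughput constraint" can look like in code.

```python
from typing import List, NamedTuple, Optional


class Mapping(NamedTuple):
    """Hypothetical summary of one task-to-core mapping."""
    name: str
    throughput: float           # higher is better
    weakest_core_stress: float  # assumed NBTI stress proxy; lower is better


def dominates(a: Mapping, b: Mapping) -> bool:
    """True if a is at least as good as b on both objectives and better on one."""
    no_worse = (a.throughput >= b.throughput
                and a.weakest_core_stress <= b.weakest_core_stress)
    better = (a.throughput > b.throughput
              or a.weakest_core_stress < b.weakest_core_stress)
    return no_worse and better


def pareto_front(candidates: List[Mapping]) -> List[Mapping]:
    """Design-time step (sketch): keep only non-dominated mappings."""
    return [m for m in candidates
            if not any(dominates(other, m) for other in candidates)]


def select_mapping(front: List[Mapping],
                   min_throughput: float) -> Optional[Mapping]:
    """Run-time step (sketch): among mappings that meet the throughput
    constraint, pick the one that stresses the weakest core least."""
    feasible = [m for m in front if m.throughput >= min_throughput]
    if not feasible:
        return None
    return min(feasible, key=lambda m: m.weakest_core_stress)


# Made-up sample data, for illustration only.
candidates = [
    Mapping("A", 100.0, 0.90),
    Mapping("B", 90.0, 0.60),
    Mapping("C", 85.0, 0.70),  # dominated by B on both objectives
    Mapping("D", 70.0, 0.40),
]
front = pareto_front(candidates)
chosen = select_mapping(front, min_throughput=80.0)
```

With this sample data, the front keeps A, B, and D, and the selection returns B: D stresses the weakest core less but falls below the assumed throughput floor of 80.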
