A Linux-governor based Dynamic Reliability Manager for Android mobile devices

Reliability is a major concern in multiprocessors. Dynamic Reliability Management (DRM) aims to trade off processor performance against lifetime. State-of-the-art publications study only the theory, supported by simulation. This paper presents the first complete software implementation, running on real hardware, of a low-overhead, Android-compatible, workload-aware DRM governor for mobile multiprocessors. We discuss the design challenges and the run-time overhead involved. We show that our governor guarantees the predefined target lifetime and achieves up to 100% lifetime improvement over traditional governors, while providing comparable performance for critical applications.
