A survey of cross-layer power-reliability tradeoffs in multi and many core systems-on-chip

As systems-on-chip increase in complexity, the underlying technology presents significant challenges due to increased power consumption and decreased reliability. Today, designers must build systems that achieve the requisite functionality and performance using components that may be unreliable. To do so, it is crucial to understand the close interplay between the different layers of a system: technology, platform, and application. This understanding enables the most general tradeoff exploration, reaping the most benefits in power, performance, and reliability. This paper surveys various cross-layer techniques and approaches for power, performance, and reliability tradeoffs at the technology, circuit, architecture, and application layers.
