Extending Moore’s Law via Computationally Error-Tolerant Computing

Dennard scaling has ended. Lowering the supply voltage (Vdd) to sub-volt levels causes intermittent losses in signal integrity, so further voltage scaling is no longer an acceptable way to reduce the power drawn by a processor core. However, the occasional errors caused by a lower Vdd can be corrected efficiently, which makes a net reduction in power possible. By deploying the right amount and kind of redundancy, we can strike a balance between the overhead incurred in achieving reliability and the energy savings realized by permitting a lower Vdd. One promising approach is the Redundant Residue Number System (RRNS) representation. Unlike other error-correcting codes, RRNS is closed under addition, subtraction and multiplication, which enables computational error correction at a fraction of the overhead of conventional approaches. We use the RRNS scheme to design a Computationally-Redundant, Energy-Efficient core, including the microarchitecture, the Instruction Set Architecture (ISA) and RRNS-centered algorithms. In simulation, this RRNS system reduces the energy-delay product by about 3× for multiplication-intensive workloads and by about 2× in general, compared to a non-error-correcting binary core.
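The closure property can be illustrated with a minimal Python sketch. It is not the core's actual design: the moduli (3, 5, 7 as non-redundant, 11 and 13 as redundant), the function names, and the projection-based correction loop are all assumptions chosen for illustration, and results are assumed to stay within the legitimate range [0, 105). Arithmetic is performed independently on each residue, and a single corrupted residue is located by dropping one residue at a time and checking whether the remaining ones reconstruct a value inside the legitimate range.

```python
from math import prod

# Hypothetical moduli, pairwise coprime and in ascending order.
# The first K are non-redundant; the last two are redundant.
MODULI = (3, 5, 7, 11, 13)
K = 3
LEGIT_RANGE = prod(MODULI[:K])  # 105: valid values lie in [0, 105)


def encode(x):
    """Represent x by its residue modulo each modulus."""
    return tuple(x % m for m in MODULI)


def add(a, b):
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))


def sub(a, b):
    return tuple((x - y) % m for x, y, m in zip(a, b, MODULI))


def mul(a, b):
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def crt(residues, moduli):
    """Chinese Remainder Theorem reconstruction over the given moduli."""
    total = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        # pow(partial, -1, m) is the modular inverse (Python 3.8+).
        x += r * partial * pow(partial, -1, m)
    return x % total


def decode(word):
    """Return (value, index_of_corrected_residue or None).

    A word is error-free if it reconstructs to a value below LEGIT_RANGE.
    Otherwise, drop each residue in turn; with two redundant moduli, only
    the projection that omits a single faulty residue falls in range.
    """
    word = tuple(word)
    value = crt(word, MODULI)
    if value < LEGIT_RANGE:
        return value, None
    for i in range(len(MODULI)):
        candidate = crt(word[:i] + word[i + 1:], MODULI[:i] + MODULI[i + 1:])
        if candidate < LEGIT_RANGE:
            return candidate, i
    raise ValueError("uncorrectable: more than one residue is in error")


if __name__ == "__main__":
    product = mul(encode(17), encode(4))
    print(decode(product))            # (68, None)

    # Flip one residue to mimic a low-Vdd fault in the modulo-7 channel.
    faulty = list(product)
    faulty[2] = (faulty[2] + 1) % MODULI[2]
    print(decode(faulty))             # (68, 2)
```

Because each residue channel is computed independently, a fault in one channel stays confined to that channel, which is what allows the result of an arithmetic operation (and not only stored data) to be checked and corrected.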
