Reliable computing with ultra-reduced instruction set co-processors

This work presents a method to reliably perform computations in the presence of hard faults arising from aggressive technology scaling, and design defects from human error. Our method is based on an observation that a single Turing-complete instruction can mirror the semantics of any other instruction. One such instruction is the subleq instruction, which has been used for instructional purposes in the past. We find that the scope for using such a Turing-complete instruction is far greater, and in this paper, we present its applicability to fault tolerance. In particular, we extend a MIPS processor with a co-processor (called ultra-reduced instruction set co-processor - URISC) that implements the subleq instruction. We use the URISC to execute sequences of subleq that are semanti-cally equivalent to the faulty instructions. We formally prove this, and implement the translations in the back-end of the LLVM compiler. We generate binaries for our hardware prototype called MIPS-URISC, which we synthesize and execute on an Altera FPGA. Our experiments indicate the performance and area overheads, and the efficacy of the proposed approach.

[1]  James E. Smith,et al.  Configurable isolation: building high availability systems with commodity multi-core processors , 2007, ISCA '07.

[2]  Albert Meixner,et al.  Detouring: Translating software to circumvent hard faults in simple cores , 2008, 2008 IEEE International Conference on Dependable Systems and Networks With FTCS and DCC (DSN).

[3]  Muhammad Shafique,et al.  Reliable software for unreliable hardware: Embedded code generation aiming at reliability , 2011, 2011 Proceedings of the Ninth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS).

[4]  Sarita V. Adve,et al.  Architectures for online error detection and recovery in multicore processors , 2011, 2011 Design, Automation & Test in Europe.

[5]  Todd M. Austin,et al.  DIVA: a reliable substrate for deep submicron microarchitecture design , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.

[6]  Shantanu Gupta,et al.  Architectural core salvaging in a multi-core processor for hard-error tolerance , 2009, ISCA '09.

[7]  Mihalis Psarakis,et al.  Accelerating microprocessor silicon validation by exposing ISA diversity , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[8]  Albert Meixner,et al.  Argus: Low-Cost, Comprehensive Error Detection in Simple Cores , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[9]  Shubhendu S. Mukherjee,et al.  Detailed design and evaluation of redundant multi-threading alternatives , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.

[10]  Behrooz Parhami,et al.  URISC: The Ultimate Reduced Instruction Set Computer , 1988 .