SBST for on-line detection of hard faults in multiprocessor applications under energy constraints

Software-Based Self-Test (SBST) has emerged as an effective method for on-line testing of processors integrated in non-safety-critical systems. However, especially for multi-core processors, dependability encompasses not only high-quality on-line tests with minimum performance overhead but also methods that prevent the generation of excessive power and heat, which exacerbate silicon aging mechanisms and can cause long-term reliability problems. In this paper, we first extend the capabilities of a multiprocessor simulator so that it can evaluate the overhead imposed on the execution of the useful application load in terms of both performance and energy consumption. We use the resulting power evaluation framework to assess the overhead of SBST implemented as a test thread in a multiprocessor environment. A range of typical processor configurations is considered. The application load consists of representative SPEC benchmarks, and various scenarios for the execution of the test thread are studied (sporadic or continuous execution). Finally, we apply in a multiprocessor context an energy optimization methodology that was originally proposed to extend the battery life of battery-powered devices. The methodology significantly reduces the energy and performance overhead without affecting the test coverage of the SBST routines.
