Analysis of on-line self-testing policies for real-time embedded multiprocessors in DSM technologies

Advances in deep submicron (DSM) technologies have a negative impact on the yield and reliability of digital circuits. On-line self-testing is an attractive solution for detecting permanent and intermittent faults in non-safety-critical, real-time embedded multiprocessors. In this paper, we describe and evaluate three scheduling and allocation policies for on-line self-testing. We show that a policy that periodically applies a test procedure to the different processors, taking into account idle times, the test history of each processor, and task priorities, offers a good trade-off between performance and fault detection probability.
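
To make the kind of policy described above concrete, the following is a minimal sketch of how a scheduler might choose which processor to test next. It is an illustration only, not the paper's algorithm: the `Processor` fields, the `pick_processor_to_test` function, the `test_period` parameter, and the tie-breaking order (idle processors first, then the processor running the lowest-priority task, then the one tested longest ago) are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Processor:
    """Hypothetical per-processor state tracked by the test scheduler."""
    pid: int
    idle: bool            # True if no task is currently allocated
    last_test_time: int   # time of the last completed self-test
    task_priority: int = 0  # priority of the running task (higher = more important)


def pick_processor_to_test(procs: List[Processor], now: int,
                           test_period: int) -> Optional[Processor]:
    """Return the processor to self-test next, or None if no test is due.

    Assumed policy: only processors whose last test is at least
    `test_period` old are candidates. Among those, prefer idle ones,
    then the one running the lowest-priority task, then the one that
    has gone longest without a test.
    """
    due = [p for p in procs if now - p.last_test_time >= test_period]
    if not due:
        return None
    return min(
        due,
        key=lambda p: (
            not p.idle,                        # idle processors sort first
            0 if p.idle else p.task_priority,  # then lowest task priority
            p.last_test_time,                  # then oldest test
        ),
    )


if __name__ == "__main__":
    procs = [
        Processor(pid=0, idle=False, last_test_time=0, task_priority=5),
        Processor(pid=1, idle=True, last_test_time=10),
        Processor(pid=2, idle=False, last_test_time=0, task_priority=1),
    ]
    chosen = pick_processor_to_test(procs, now=100, test_period=50)
    print(chosen.pid if chosen else None)
```

Under these assumptions, testing an idle processor costs no task performance, while preempting a busy one trades some performance for fault detection, which is the trade-off the abstract refers to.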
