mSWAT: Low-cost hardware fault detection and diagnosis for multicore systems
Sarita V. Adve | Pradeep Ramachandran | Man-Lap Li | Byn Choi | Siva Kumar Sastry Hari
[1] David García,et al. NonStop® advanced architecture , 2005, 2005 International Conference on Dependable Systems and Networks (DSN'05).
[2] Derek Hower,et al. Rerun: Exploiting Episodes for Lightweight Memory Race Recording , 2008, 2008 International Symposium on Computer Architecture.
[3] Shubhendu S. Mukherjee,et al. Perturbation-based Fault Screening , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.
[4] Sarita V. Adve,et al. Accurate microarchitecture-level fault modeling for studying hardware faults , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[5] Josep Torrellas,et al. DeLorean: Recording and Deterministically Replaying Shared-Memory Multiprocessor Execution Efficiently , 2008, 2008 International Symposium on Computer Architecture.
[6] Sanjay J. Patel,et al. ReStore: symptom based soft error detection in microprocessors , 2005, 2005 International Conference on Dependable Systems and Networks (DSN'05).
[7] Min Xu,et al. A "flight data recorder" for enabling full-system multiprocessor deterministic replay , 2003, ISCA '03.
[8] Shekhar Y. Borkar,et al. Design challenges of technology scaling , 1999, IEEE Micro.
[9] Todd M. Austin,et al. DIVA: a reliable substrate for deep submicron microarchitecture design , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[10] Josep Torrellas,et al. DeLorean: Recording and Deterministically Replaying Shared-Memory Multiprocessor Execution Efficiently , 2008, International Symposium on Computer Architecture.
[11] Shekhar Y. Borkar,et al. Designing reliable systems from unreliable components: the challenges of transistor variability and degradation , 2005, IEEE Micro.
[12] Sule Ozev,et al. Online diagnosis of hard faults in microprocessors , 2007, TACO.
[13] Milo M. K. Martin,et al. SafetyNet: improving the availability of shared memory multiprocessors with global checkpoint/recovery , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[14] Pradip Bose,et al. Exploiting structural duplication for lifetime reliability enhancement , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[15] Josep Torrellas,et al. ReVive: cost-effective architectural support for rollback recovery in shared-memory multiprocessors , 2002, ISCA.
[16] Eric Rotenberg,et al. AR-SMT: a microarchitectural approach to fault tolerance in microprocessors , 1999, Digest of Papers. Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing (Cat. No.99CB36352).
[17] Eric Rotenberg,et al. Coverage of a microarchitecture-level fault check regimen in a superscalar processor , 2008, 2008 IEEE International Conference on Dependable Systems and Networks With FTCS and DCC (DSN).
[18] Lisa Spainhower,et al. IBM S/390 Parallel Enterprise Server G5 fault tolerance: A historical perspective , 1999, IBM J. Res. Dev.
[19] Sarita V. Adve,et al. Trace-based microarchitecture-level diagnosis of permanent hardware faults , 2008, 2008 IEEE International Conference on Dependable Systems and Networks With FTCS and DCC (DSN).
[20] John P. Hayes,et al. Low-cost on-line fault detection using control flow assertions , 2003, 9th IEEE On-Line Testing Symposium, 2003. IOLTS 2003.
[21] Todd M. Austin,et al. Ultra low-cost defect protection for microprocessor pipelines , 2006, ASPLOS XII.
[22] David A. Wood,et al. Full-system timing-first simulation , 2002, SIGMETRICS '02.
[23] Massimo Violante,et al. Soft-error detection using control flow assertions , 2003, Proceedings 18th IEEE Symposium on Defect and Fault Tolerance in VLSI Systems.
[24] Albert Meixner,et al. Argus: Low-Cost, Comprehensive Error Detection in Simple Cores , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[25] Todd M. Austin,et al. CrashTest: A fast high-fidelity FPGA-based resiliency analysis framework , 2008, 2008 IEEE International Conference on Computer Design.
[26] Ravishankar K. Iyer,et al. An Architectural Framework for Detecting Process Hangs/Crashes , 2005, EDCC.
[27] Eric Rotenberg,et al. Assertion-Based Microarchitecture Design for Improved Fault Tolerance , 2006, 2006 International Conference on Computer Design.
[28] Ravishankar K. Iyer,et al. Dynamic Derivation of Application-Specific Error Detectors and their Implementation in Hardware , 2006, 2006 Sixth European Dependable Computing Conference.
[29] Donald Yeung,et al. Application-Level Correctness and its Impact on Fault Tolerance , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.
[30] Sarita V. Adve,et al. Understanding the propagation of hard errors to software and implications for resilient system design , 2008, ASPLOS.
[31] Wi N Dows. FLIGHT DATA RECORDER FOR , 2007 .
[32] Amin Ansari,et al. The StageNet fabric for constructing resilient multicore systems , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[33] Satish Narayanasamy,et al. BugNet: continuously recording program execution for deterministic replay debugging , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[34] Huiyang Zhou,et al. Unified Architectural Support for Soft-Error Protection or Software Bug Detection , 2007, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007).
[35] Sarita V. Adve,et al. Using likely program invariants to detect hardware errors , 2008, 2008 IEEE International Conference on Dependable Systems and Networks With FTCS and DCC (DSN).
[36] Josep Torrellas,et al. ReViveI/O: efficient handling of I/O in highly-available rollback-recovery servers , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006.
[37] Ravishankar K. Iyer,et al. An end-to-end approach for the automatic derivation of application-aware error detectors , 2009, 2009 IEEE/IFIP International Conference on Dependable Systems & Networks.