Using Process-Level Redundancy to Exploit Multiple Cores for Transient Fault Tolerance
[1] Shubhendu S. Mukherjee,et al. Transient fault detection via simultaneous multithreading , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[2] Dirk Grunwald,et al. Shadow Profiling: Hiding Instrumentation Costs with Parallelism , 2007, International Symposium on Code Generation and Optimization (CGO'07).
[3] Ravishankar K. Iyer,et al. Application-based metrics for strategic placement of detectors , 2005, 11th Pacific Rim International Symposium on Dependable Computing (PRDC'05).
[4] T. N. Vijaykumar,et al. Opportunistic Transient-Fault Detection , 2005, ISCA 2005.
[5] Joel S. Emer,et al. Techniques to reduce the soft error rate of a high-performance microprocessor , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004.
[6] N. Hengartner,et al. Predicting the number of fatal soft errors in Los Alamos national laboratory's ASC Q supercomputer , 2005, IEEE Transactions on Device and Materials Reliability.
[7] Cheng Wang,et al. Compiler-Managed Software-based Redundant Multi-Threading for Transient Fault Detection , 2007, International Symposium on Code Generation and Optimization (CGO'07).
[8] Harish Patil,et al. Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.
[9] David I. August,et al. SWIFT: software implemented fault tolerance , 2005, International Symposium on Code Generation and Optimization.
[10] J. Goldberg,et al. SIFT: Design and analysis of a fault-tolerant computer for aircraft control , 1978, Proceedings of the IEEE.
[11] Derek Bruening,et al. Maintaining consistency and bounding capacity of software code caches , 2005, International Symposium on Code Generation and Optimization.
[12] Edward J. McCluskey,et al. Control-flow checking by software signatures , 2002, IEEE Trans. Reliab..
[13] Emery D. Berger,et al. DieHard: probabilistic memory safety for unsafe languages , 2006, PLDI '06.
[14] Robert W. Horst,et al. Multiple instruction issue in the NonStop cyclone processor , 1990, ISCA '90.
[15] Shubhendu S. Mukherjee,et al. Detailed design and evaluation of redundant multithreading alternatives , 2002, ISCA.
[16] Algirdas Avizienis,et al. The N-Version Approach to Fault-Tolerant Software , 1985, IEEE Transactions on Software Engineering.
[17] Thomas C. Bressoud,et al. TFT: a software system for application-transparent fault tolerance , 1998, Digest of Papers. Twenty-Eighth Annual International Symposium on Fault-Tolerant Computing (Cat. No.98CB36224).
[18] Fred B. Schneider,et al. Hypervisor-based fault tolerance , 1995, TOCS.
[19] Sanjay J. Patel,et al. Y-branches: when you come to a fork in the road, take it , 2003, 2003 12th International Conference on Parallel Architectures and Compilation Techniques.
[20] Paul Vickers,et al. Somersault Software Fault-Tolerance , 1998.
[21] Y. C. Yeh,et al. Triple-triple redundant 777 primary flight computer , 1996, 1996 IEEE Aerospace Applications Conference. Proceedings.
[22] K. Soumyanath,et al. Scaling trends of cosmic ray induced soft errors in static latches beyond 0.18 μ , 2001, 2001 Symposium on VLSI Circuits. Digest of Technical Papers (IEEE Cat. No.01CH37185).
[23] K. Sundaramoorthy,et al. Slipstream processors: improving both performance and fault tolerance , 2000, ASPLOS IX.
[24] H. Ando,et al. A 1.3GHz fifth generation SPARC64 microprocessor , 2003, Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451).
[25] Edward J. McCluskey,et al. Error detection by duplicated instructions in super-scalar processors , 2002, IEEE Trans. Reliab..
[26] Timothy J. Slegel,et al. IBM's S/390 G5 microprocessor design , 1999, IEEE Micro.
[27] Robert W. Horst,et al. Multiple instruction issue in the NonStop Cyclone processor , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[28] Martin Hiller,et al. Executable assertions for detecting data errors in embedded control systems , 2000, Proceeding International Conference on Dependable Systems and Networks. DSN 2000.
[29] James L. Walsh,et al. IBM experiments in soft fails in computer electronics (1978-1994) , 1996, IBM J. Res. Dev..
[30] Johan Karlsson,et al. Experimental evaluation of time-redundant execution for a brake-by-wire application , 2002, Proceedings International Conference on Dependable Systems and Networks.
[31] Wolfgang Graetsch,et al. Fault tolerance under UNIX , 1989, TOCS.
[32] Ravishankar K. Iyer,et al. Chameleon: A Software Infrastructure for Adaptive Fault Tolerance , 1999, IEEE Trans. Parallel Distributed Syst..
[33] Irith Pomeranz,et al. Transient-fault recovery for chip multiprocessors , 2003, ISCA '03.
[34] Neeraj Suri,et al. On the placement of software mechanisms for detection of data errors , 2002, Proceedings International Conference on Dependable Systems and Networks.
[35] Richard D. Schlichting,et al. Fail-stop processors: an approach to designing fault-tolerant computing systems , 1981, TOCS.
[36] Irith Pomeranz,et al. Transient-fault recovery using simultaneous multithreading , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.