PLR: A Software Approach to Transient Fault Tolerance for Multicore Architectures
暂无分享,去创建一个
Tipp Moseley | Vijay Janapa Reddi | Daniel A. Connors | Alex Shye | Joseph Blomstedt | V. Reddi | D. Connors | Tipp Moseley | Alex Shye | Joseph Blomstedt
[1] James L. Walsh,et al. IBM experiments in soft fails in computer electronics (1978-1994) , 1996, IBM J. Res. Dev..
[2] Paul Vickers,et al. Somersault Software Fault-Tolerance , 1998 .
[3] Martin Hiller,et al. Executable assertions for detecting data errors in embedded control systems , 2000, Proceeding International Conference on Dependable Systems and Networks. DSN 2000.
[4] Kim M. Hazelwood,et al. SuperPin: Parallelizing Dynamic Instrumentation for Real-Time Performance , 2007, International Symposium on Code Generation and Optimization (CGO'07).
[5] Hiroyuki Sugiyama,et al. A 1.3 GHz fifth generation SPARC64 microprocessor , 2003, 2003 IEEE International Solid-State Circuits Conference, 2003. Digest of Technical Papers. ISSCC..
[6] Timothy J. Slegel,et al. IBM's S/390 G5 microprocessor design , 1999, IEEE Micro.
[7] Johan Karlsson,et al. Experimental evaluation of time-redundant execution for a brake-by-wire application , 2002, Proceedings International Conference on Dependable Systems and Networks.
[8] Sanjay J. Patel,et al. Y-branches: when you come to a fork in the road, take it , 2003, 2003 12th International Conference on Parallel Architectures and Compilation Techniques.
[9] Irith Pomeranz,et al. Transient-fault recovery for chip multiprocessors , 2003, 30th Annual International Symposium on Computer Architecture, 2003. Proceedings..
[10] Harish Patil,et al. Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.
[11] Joel S. Emer,et al. Techniques to reduce the soft error rate of a high-performance microprocessor , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..
[12] Neeraj Suri,et al. On the placement of software mechanisms for detection of data errors , 2002, Proceedings International Conference on Dependable Systems and Networks.
[13] Y. C. Yeh,et al. Triple-triple redundant 777 primary flight computer , 1996, 1996 IEEE Aerospace Applications Conference. Proceedings.
[14] Edward J. McCluskey,et al. Error detection by duplicated instructions in super-scalar processors , 2002, IEEE Trans. Reliab..
[15] David I. August,et al. SWIFT: software implemented fault tolerance , 2005, International Symposium on Code Generation and Optimization.
[16] James E. Smith,et al. Virtual machines - versatile platforms for systems and processes , 2005 .
[17] Shubhendu S. Mukherjee,et al. Detailed design and evaluation of redundant multithreading alternatives , 2002, ISCA.
[18] Cheng Wang,et al. Compiler-Managed Software-based Redundant Multi-Threading for Transient Fault Detection , 2007, International Symposium on Code Generation and Optimization (CGO'07).
[19] Eric Rotenberg,et al. AR-SMT: a microarchitectural approach to fault tolerance in microprocessors , 1999, Digest of Papers. Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing (Cat. No.99CB36352).
[20] Algirdas Avizienis,et al. The N-Version Approach to Fault-Tolerant Software , 1985, IEEE Transactions on Software Engineering.
[21] Shubhendu S. Mukherjee,et al. Transient fault detection via simultaneous multithreading , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[22] Dirk Grunwald,et al. Shadow Profiling: Hiding Instrumentation Costs with Parallelism , 2007, International Symposium on Code Generation and Optimization (CGO'07).
[23] Anthony Skjellum,et al. MPI/FT: A Model-Based Approach to Low-Overhead Fault Tolerant Message-Passing Middleware , 2004, Cluster Computing.
[24] Ravishankar K. Iyer,et al. Application-based metrics for strategic placement of detectors , 2005, 11th Pacific Rim International Symposium on Dependable Computing (PRDC'05).
[25] N. Hengartner,et al. Predicting the number of fatal soft errors in Los Alamos national laboratory's ASC Q supercomputer , 2005, IEEE Transactions on Device and Materials Reliability.
[26] Irith Pomeranz,et al. Transient-fault recovery using simultaneous multithreading , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[27] Wolfgang Graetsch,et al. Fault tolerance under UNIX , 1989, TOCS.
[28] Ravishankar K. Iyer,et al. Chameleon: A Software Infrastructure for Adaptive Fault Tolerance , 1999, IEEE Trans. Parallel Distributed Syst..
[29] Thomas C. Bressoud,et al. TFT: a software system for application-transparent fault tolerance , 1998, Digest of Papers. Twenty-Eighth Annual International Symposium on Fault-Tolerant Computing (Cat. No.98CB36224).
[30] Fred B. Schneider,et al. Hypervisor-based fault tolerance , 1996, TOCS.
[31] Sanjay J. Patel,et al. Characterizing the effects of transient faults on a high-performance processor pipeline , 2004, International Conference on Dependable Systems and Networks, 2004.
[32] Tipp Moseley,et al. Using Process-Level Redundancy to Exploit Multiple Cores for Transient Fault Tolerance , 2007, 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07).
[33] J. Goldberg,et al. SIFT: Design and analysis of a fault-tolerant computer for aircraft control , 1978, Proceedings of the IEEE.
[34] Derek Bruening,et al. Maintaining consistency and bounding capacity of software code caches , 2005, International Symposium on Code Generation and Optimization.
[35] Eric Rotenberg,et al. A study of slipstream processors , 2000, MICRO 33.
[36] Richard D. Schlichting,et al. Fail-stop processors: an approach to designing fault-tolerant computing systems , 1983, TOCS.
[37] T. N. Vijaykumar,et al. Opportunistic transient-fault detection , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[38] K. Soumyanath,et al. Scaling trends of cosmic ray induced soft errors in static latches beyond 0.18 /spl mu/ , 2001, 2001 Symposium on VLSI Circuits. Digest of Technical Papers (IEEE Cat. No.01CH37185).
[39] B. Zorn,et al. Exterminator : Automatically Correcting Memory Errors , 2006 .
[40] K. Sundaramoorthy,et al. Slipstream processors: improving both performance and fault tolerance , 2000, SIGP.
[41] Emery D. Berger,et al. Exterminator: automatically correcting memory errors with high probability , 2007, PLDI '07.
[42] Robert W. Horst,et al. Multiple instruction issue in the NonStop cyclone processor , 1990, ISCA '90.
[43] Edward J. McCluskey,et al. Control-flow checking by software signatures , 2002, IEEE Trans. Reliab..
[44] Emery D. Berger,et al. DieHard: probabilistic memory safety for unsafe languages , 2006, PLDI '06.
[45] Zizhong Chen,et al. Process Fault Tolerance: Semantics, Design and Applications for High Performance Computing , 2005, Int. J. High Perform. Comput. Appl..