Supporting highly-decoupled thread-level redundancy for parallel programs
[1] Vivek De,et al. Measurements and analysis of SER-tolerant latch in a 90-nm dual-V_T CMOS process , 2004 .
[2] Satish Narayanasamy,et al. Recording shared memory dependencies using strata , 2006, ASPLOS XII.
[3] Norman P. Jouppi,et al. Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction , 2003, MICRO.
[4] Joel S. Emer,et al. Techniques to reduce the soft error rate of a high-performance microprocessor , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..
[5] Anoop Gupta,et al. Parallel computer architecture - a hardware / software approach , 1998 .
[6] Alan L. Cox,et al. TreadMarks: shared memory computing on networks of workstations , 1996 .
[7] Shubhendu S. Mukherjee,et al. Transient fault detection via simultaneous multithreading , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[8] Sanjay J. Patel,et al. Characterizing the effects of transient faults on a high-performance processor pipeline , 2004, International Conference on Dependable Systems and Networks, 2004.
[9] J. von Neumann,et al. Probabilistic Logic and the Synthesis of Reliable Organisms from Unreliable Components , 1956 .
[10] Kenneth C. Yeager. The Mips R10000 superscalar microprocessor , 1996, IEEE Micro.
[11] Todd M. Austin,et al. DIVA: a reliable substrate for deep submicron microarchitecture design , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[12] Marc Tremblay,et al. The implementation and application of micro rollback in fault-tolerant VLSI systems , 1988, [1988] The Eighteenth International Symposium on Fault-Tolerant Computing. Digest of Papers.
[13] Todd M. Austin,et al. The SimpleScalar tool set, version 2.0 , 1997, CARN.
[14] John A. Rohr. STAREX SELF-REPAIR ROUTINES: SOFTWARE RECOVERY IN THE JPL-STAR COMPUTER , 1995, Twenty-Fifth International Symposium on Fault-Tolerant Computing, 1995, 'Highlights from Twenty-Five Years'.
[15] Irith Pomeranz,et al. Transient-fault recovery for chip multiprocessors , 2003, 30th Annual International Symposium on Computer Architecture, 2003. Proceedings..
[16] Jill J. Hallenbeck,et al. Modulo 3 Residue Checker: New Results on Performance and Cost , 1988, IEEE Trans. Computers.
[17] Hiroyuki Sugiyama,et al. A 1.3 GHz fifth generation SPARC64 microprocessor , 2003, 2003 IEEE International Solid-State Circuits Conference, 2003. Digest of Technical Papers. ISSCC..
[18] Timothy J. Slegel,et al. IBM's S/390 G5 microprocessor design , 1999, IEEE Micro.
[19] Babak Falsafi,et al. Fingerprinting: bounding soft-error-detection latency and bandwidth , 2004, IEEE Micro.
[20] Yuval Tamir,et al. Application-transparent process-level error recovery for multicomputers , 1989, [1989] Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences. Volume 1: Architecture Track.
[21] Eric Rotenberg,et al. AR-SMT: a microarchitectural approach to fault tolerance in microprocessors , 1999, Digest of Papers. Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing (Cat. No.99CB36352).
[22] David García,et al. NonStop® advanced architecture , 2005, 2005 International Conference on Dependable Systems and Networks (DSN'05).
[23] Dhiraj K. Pradhan,et al. Fault-Tolerant Computing , 2008, Wiley Encyclopedia of Computer Science and Engineering.
[24] Janak H. Patel,et al. Concurrent Error Detection in Multiply and Divide Arrays , 1983, IEEE Transactions on Computers.
[25] Todd M. Austin,et al. Efficient checker processor design , 2000, Proceedings 33rd Annual IEEE/ACM International Symposium on Microarchitecture. MICRO-33 2000.
[26] Harry Muljono,et al. A 1.5-GHz 130-nm Itanium® 2 Processor with 6-MB on-die L3 cache , 2003 .
[27] W. W. Peterson. On Checking an Adder , 1958, IBM J. Res. Dev..
[28] Babak Falsafi,et al. Efficient Resource Sharing in Concurrent Error Detecting Superscalar Microarchitectures , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[29] Timothy J. Dell,et al. A white paper on the benefits of chipkill-correct ecc for pc server main memory , 1997 .
[30] Josep Torrellas,et al. ReVive: cost-effective architectural support for rollback recovery in shared-memory multiprocessors , 2002, ISCA.
[31] Ram Chillarege,et al. IBM's ES/9000 Model 982's fault-tolerant design for consolidation , 1994, IEEE Micro.
[32] Balaram Sinharoy,et al. POWER5 system microarchitecture , 2005, IBM J. Res. Dev..
[33] Min Xu,et al. A regulated transitive reduction (RTR) for longer memory race recording , 2006, ASPLOS XII.
[34] Anoop Gupta,et al. The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.
[35] Milo M. K. Martin,et al. SafetyNet: improving the availability of shared memory multiprocessors with global checkpoint/recovery , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[36] Babak Falsafi,et al. Dual use of superscalar datapath for transient-fault detection and recovery , 2001, MICRO.
[37] Min Xu,et al. A "flight data recorder" for enabling full-system multiprocessor deterministic replay , 2003, ISCA '03.
[38] Min Yinghua. Dependable Systems and Networks , 2001 .
[39] Andrea Bondavalli,et al. Efficient fault tolerance: an approach to deal with transient faults in multiprocessor architectures , 1994, Proceedings of 1994 International Conference on Parallel and Distributed Systems.
[40] Janak H. Patel,et al. Error Recovery in Shared Memory Multiprocessors Using Private Caches , 1990, IEEE Trans. Parallel Distributed Syst..
[41] Michael Dowd,et al. Designing A Single Board Computer For Space Using the Most Advanced Processor and Mitigation Technologies , 2004 .
[42] N. Ghani,et al. A Recovery Cache for the PDP-11 , 1980, IEEE Transactions on Computers.
[43] Lorenzo Alvisi,et al. Modeling the effect of technology trends on the soft error rate of combinational logic , 2002, Proceedings International Conference on Dependable Systems and Networks.
[44] P. Eaton,et al. Soft error rate mitigation techniques for modern microcircuits , 2002, 2002 IEEE International Reliability Physics Symposium. Proceedings. 40th Annual (Cat. No.02CH37320).
[45] Babak Falsafi,et al. Reunion: Complexity-Effective Multicore Redundancy , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[46] J. Ziegler,et al. Effect of Cosmic Rays on Computer Memories , 1979, Science.
[47] Balaram Sinharoy,et al. POWER4 system microarchitecture , 2002, IBM J. Res. Dev..
[48] R. Hokinson,et al. Historical trend in alpha-particle induced soft error rates of the Alpha™ microprocessor , 2001, 2001 IEEE International Reliability Physics Symposium Proceedings. 39th Annual (Cat. No.00CH37167).
[49] N. Seifert,et al. Robust system design with built-in soft-error resilience , 2005, Computer.
[50] Irith Pomeranz,et al. Transient-fault recovery using simultaneous multithreading , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[51] Michael C. Huang,et al. Exploiting coarse-grain verification parallelism for power-efficient fault tolerance , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).
[52] Anand Sivasubramaniam,et al. A complexity-effective approach to ALU bandwidth enhancement for instruction-level temporal redundancy , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..
[53] Shubhendu S. Mukherjee,et al. Detailed design and evaluation of redundant multithreading alternatives , 2002, ISCA.
[54] A A Schäffer,et al. Parallelization of general-linkage analysis problems. , 1994, Human heredity.
[55] Michael Nicolaidis. Time redundancy based soft-error tolerance to rescue nanometer technologies , 1999, Proceedings 17th IEEE VLSI Test Symposium (Cat. No.PR00146).
[56] Christos A. Papachristou,et al. An efficient BICS design for SEUs detection and correction in semiconductor memories , 2005, Design, Automation and Test in Europe.
[57] Kang G. Shin,et al. Design and Evaluation of a Fault-Tolerant Multiprocessor Using Hardware Recovery Blocks , 1984, IEEE Transactions on Computers.
[58] Philip A. Bernstein,et al. Sequoia: a fault-tolerant tightly coupled multiprocessor for transaction processing , 1988, Computer.
[59] K. Soumyanath,et al. Measurements and analysis of SER tolerant latch in a 90 nm dual-Vt CMOS process , 2003, Proceedings of the IEEE 2003 Custom Integrated Circuits Conference, 2003..
[60] Babak Falsafi,et al. TRUSS: a reliable, scalable server architecture , 2005, IEEE Micro.
[61] Compilation Techniques,et al. Parallel architectures and compilation techniques , 1995 .
[62] Changhong Dai,et al. Impact of CMOS process scaling and SOI on the soft error rates of logic processes , 2001, 2001 Symposium on VLSI Technology. Digest of Technical Papers (IEEE Cat. No.01 CH37184).