ReVive: cost-effective architectural support for rollback recovery in shared-memory multiprocessors
[1] Satoshi Hoshina,et al. Fault recovery mechanism for multiprocessor servers , 1997, Proceedings of IEEE 27th International Symposium on Fault Tolerant Computing.
[2] Kai Li,et al. Diskless Checkpointing , 1998, IEEE Trans. Parallel Distributed Syst..
[3] Abraham Silberschatz,et al. Database System Concepts, 3rd Edition , 1991 .
[4] Randy H. Katz,et al. A case for redundant arrays of inexpensive disks (RAID) , 1988, SIGMOD '88.
[5] Randy H. Katz,et al. A case for redundant arrays of inexpensive disks (RAID) , 1988 .
[6] Anne-Marie Kermarrec,et al. A recoverable distributed shared memory integrating coherence and recoverability , 1995, Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers.
[7] Kai Li,et al. Faster checkpointing with N+1 parity , 1994, Proceedings of IEEE 24th International Symposium on Fault-Tolerant Computing.
[8] Mark Horowitz,et al. Hardware Fault Containment In Scalable Shared-memory Multiprocessors , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[9] Janak H. Patel,et al. Error Recovery in Shared Memory Multiprocessors Using Private Caches , 1990, IEEE Trans. Parallel Distributed Syst..
[10] Eric Rotenberg,et al. Slipstream processors: improving both performance and fault tolerance , 2000, SIGP.
[11] James S. Plank,et al. Experimental assessment of workstation failures and their impact on checkpointing systems , 1998, Digest of Papers. Twenty-Eighth Annual International Symposium on Fault-Tolerant Computing (Cat. No.98CB36224).
[12] L. Alvisi,et al. A Survey of Rollback-Recovery Protocols , 2002 .
[13] Kevin Skadron,et al. Proceedings 29th Annual International Symposium on Computer Architecture , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[14] Anne-Marie Kermarrec,et al. An Efficient and Scalable Approach for Implementing Fault-Tolerant DSM Architectures , 2000, IEEE Trans. Computers.
[15] Willy Zwaenepoel,et al. Manetho: Transparent Rollback-Recovery with Low Overhead, Limited Rollback, and Fast Output Commit , 1992, IEEE Trans. Computers.
[16] Anoop Gupta,et al. The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.
[17] Milo M. K. Martin,et al. SafetyNet: improving the availability of shared memory multiprocessors with global checkpoint/recovery , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[18] Josep Torrellas,et al. A direct-execution framework for fast and accurate simulation of superscalar processors , 1998, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192).
[19] Min Xu,et al. Evaluating Non-deterministic Multi-threaded Commercial Workloads , 2001 .
[20] Abraham Silberschatz,et al. Database System Concepts , 1980 .
[21] Michel Banâtre,et al. Cache management in a tightly coupled fault tolerant multiprocessor , 1990, [1990] Digest of Papers. Fault-Tolerant Computing: 20th International Symposium.
[22] D. Lenoski,et al. The SGI Origin: A ccNUMA Highly Scalable Server , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[23] Shubhendu S. Mukherjee,et al. Detailed design and evaluation of redundant multithreading alternatives , 2002, ISCA.
[24] Todd M. Austin,et al. DIVA: a reliable substrate for deep submicron microarchitecture design , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[25] Michael J. Flynn,et al. Multiprocessor architecture using an audit trail for fault tolerance , 1999, Digest of Papers. Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing (Cat. No.99CB36352).
[26] Anne-Marie Kermarrec,et al. COMA: An Opportunity for Building Fault-Tolerant Scalable Shared Memory Multiprocessors , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[27] Rana Ejaz Ahmed,et al. Cache-aided rollback error recovery (CARER) algorithm for shared-memory multiprocessor systems , 1990, [1990] Digest of Papers. Fault-Tolerant Computing: 20th International Symposium.
[28] Anoop Gupta,et al. The Stanford Dash multiprocessor , 1992, Computer.
[29] Kai Li,et al. Memory Exclusion: Optimizing the Performance of Checkpointing Systems , 1999, Softw. Pract. Exp..
[30] Christine Morin,et al. An Architecture for Tolerating Processor Failures in Shared Memory Multiprocessors , 1996, IEEE Trans. Computers.
[31] Liviu Iftode,et al. Scalable Fault-Tolerant Distributed Shared Memory , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[32] Brian Randell,et al. System structure for software fault tolerance , 1975, IEEE Transactions on Software Engineering.