Propitious Checkpoint Intervals to Improve System Performance
[1] Larry Rudolph, et al. Cooperative checkpointing: a robust approach to large-scale systems reliability, 2006, ICS '06.
[2] James S. Plank, et al. Experimental assessment of workstation failures and their impact on checkpointing systems, 1998, Digest of Papers. Twenty-Eighth Annual International Symposium on Fault-Tolerant Computing (Cat. No.98CB36224).
[3] Nitin H. Vaidya, et al. Impact of Checkpoint Latency on Overhead Ratio of a Checkpointing Scheme, 1997, IEEE Trans. Computers.
[4] Ravishankar K. Iyer, et al. Modeling coordinated checkpointing for large-scale supercomputers, 2005, 2005 International Conference on Dependable Systems and Networks (DSN'05).
[5] Anand Sivasubramaniam, et al. Filtering failure logs for a BlueGene/L prototype, 2005, 2005 International Conference on Dependable Systems and Networks (DSN'05).
[6] Ron A. Oldfield. Lightweight storage and overlay networks for fault tolerance, 2006.
[7] James S. Plank, et al. Processor Allocation and Checkpoint Interval Selection in Cluster Computing Systems, 2001, J. Parallel Distributed Comput.
[8] John W. Young, et al. A first order approximation to the optimum checkpoint interval, 1974, CACM.
[9] Seetharami R. Seelam, et al. Modeling the Impact of Checkpoints on Next-Generation Systems, 2007, 24th IEEE Conference on Mass Storage Systems and Technologies (MSST 2007).
[10] John T. Daly, et al. A higher order estimate of the optimum checkpoint interval for restart dumps, 2006, Future Gener. Comput. Syst.
[11] Larry Rudolph, et al. Cooperative checkpointing theory, 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.
[12] David S. Greenberg, et al. A System Software Architecture for High End Computing, 1997, ACM/IEEE SC 1997 Conference (SC'97).
[13] R. Vilalta, et al. Providing Persistent and Consistent Resources through Event Log Analysis and Predictions for Large-scale Computing Systems, 2002.
[14] Mark S. Squillante, et al. Failure data analysis of a large-scale heterogeneous server environment, 2004, International Conference on Dependable Systems and Networks, 2004.
[15] Jack Dongarra, et al. Fault tolerant matrix operations for networks of workstations using multiple checkpointing, 1997, Proceedings High Performance Computing on the Information Superhighway. HPC Asia '97.
[16] Alan D. George, et al. Optimization of checkpointing-related I/O for high-performance parallel and distributed computing, 2007, The Journal of Supercomputing.
[17] David F. Heidel, et al. An Overview of the BlueGene/L Supercomputer, 2002, ACM/IEEE SC 2002 Conference (SC'02).
[18] William H. Sanders, et al. Performance analysis of two time-based coordinated checkpointing protocols, 1997, Proceedings Pacific Rim International Symposium on Fault-Tolerant Systems.
[19] E. N. Elnozahy, et al. Checkpointing for peta-scale systems: a look into the future of practical rollback-recovery, 2004, IEEE Transactions on Dependable and Secure Computing.