Simulating Application Resilience at Exascale
[1] Rolf Riesen, et al. See applications run and throughput jump: The case for redundant computing in HPC, 2010, 2010 International Conference on Dependable Systems and Networks Workshops (DSN-W).
[2] Rolf Riesen, et al. Communication patterns, 2006.
[3] Seetharami R. Seelam, et al. Modeling the Impact of Checkpoints on Next-Generation Systems, 2007, 24th IEEE Conference on Mass Storage Systems and Technologies (MSST 2007).
[4] Andrzej Goscinski, et al. A survey and review of the current state of rollback-recovery for cluster systems, 2009.
[5] E. N. Elnozahy, et al. Checkpointing for peta-scale systems: a look into the future of practical rollback-recovery, 2004, IEEE Transactions on Dependable and Secure Computing.
[6] Keith D. Underwood, et al. The structural simulation toolkit: exploring novel architectures, 2006, SC.
[7] Bronis R. de Supinski, et al. Design, Modeling, and Evaluation of a Scalable Multi-level Checkpointing System, 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[8] Bruce Jacob, et al. The structural simulation toolkit, 2006, PERV.
[9] B. R. de Supinski, et al. Detailed Modeling, Design, and Evaluation of a Scalable Multi-level Checkpointing System, 2010.
[10] Fernando M. A. Silva, et al. Efficient Parallel Subgraph Counting Using G-Tries, 2010, 2010 IEEE International Conference on Cluster Computing.
[11] Lorenzo Alvisi, et al. An analysis of communication induced checkpointing, 1999, Digest of Papers. Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing (Cat. No.99CB36352).
[12] Rolf Riesen, et al. A framework for architecture-level power, area, and thermal simulation and its application to network-on-chip design exploration, 2011, PERV.