To checkpoint or not to checkpoint: Understanding energy-performance-I/O tradeoffs in HPC checkpointing
[1] Thomas Hérault, et al. Optimal Checkpointing Period: Time vs. Energy, 2013, PMBS@SC.
[2] Bianca Schroeder, et al. A Large-Scale Study of Failures in High-Performance Computing Systems, 2006, IEEE Transactions on Dependable and Secure Computing.
[3] Matthias S. Müller, et al. Quantifying power consumption variations of HPC systems using SPEC MPI benchmarks, 2010, Computer Science - Research and Development.
[4] Bianca Schroeder, et al. A Large-Scale Study of Failures in High-Performance Computing Systems, 2010, IEEE Trans. Dependable Secur. Comput.
[5] Rolf Riesen, et al. Evaluating energy savings for checkpoint/restart, 2013, E2SC '13.
[6] John W. Young, et al. A first order approximation to the optimum checkpoint interval, 1974, CACM.
[7] Franck Cappello, et al. Energy considerations in checkpointing and fault tolerance protocols, 2012, IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN 2012).
[8] Matthew L. Curry, et al. Power use of disk subsystems in supercomputers, 2011, PDSW '11.
[9] Franck Cappello, et al. ECOFIT: A Framework to Estimate Energy Consumption of Fault Tolerance protocols during HPC executions, 2013, CCGrid 2013.
[10] Laxmikant V. Kalé, et al. Assessing Energy Efficiency of Fault Tolerance Protocols for HPC Systems, 2012, 2012 IEEE 24th International Symposium on Computer Architecture and High Performance Computing.
[11] Bianca Schroeder, et al. Checkpoint/restart in practice: When ‘simple is better’, 2014, 2014 IEEE International Conference on Cluster Computing (CLUSTER).
[12] John T. Daly, et al. A higher order estimate of the optimum checkpoint interval for restart dumps, 2006, Future Gener. Comput. Syst.
[13] Satoshi Matsuoka, et al. Energy-aware I/O optimization for checkpoint and restart on a NAND flash memory system, 2013, FTXS '13.