Fault oblivious high performance computing with dynamic task replication and substitution
Yevgeniy Vorobeychik | Ronald Minnich | Robert C. Armstrong | Jackson Mayo | Don W. Rudish
[1] John F. Karpovich, et al. Fault Tolerance via Replication in Coarse Grain Data-Flow, 1995, PSLS.
[2] Ümit V. Çatalyürek, et al. Hypergraph-based Dynamic Load Balancing for Adaptive Scientific Computations, 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[3] Jackson Mayo, et al. Methodologies for advance warning of compute cluster problems via statistical analysis: a case study, 2009, Resilience '09.
[4] John Daly. A Model for Predicting the Optimum Checkpoint Interval for Restart Dumps, 2003, International Conference on Computational Science.
[5] John F. Karpovich, et al. Fault-Tolerance in Coarse Grain Data Flow, 1995.
[6] David S. Johnson, et al. Computers and Intractability: A Guide to the Theory of NP-Completeness, 1978.
[7] Lyn C. Thomas, et al. Serial and parallel value iteration algorithms for discounted Markov decision processes, 1993.
[8] Jack Dongarra, et al. Computational Science — ICCS 2003, 2003, Lecture Notes in Computer Science.
[9] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[10] Andrew G. Barto, et al. Reinforcement learning, 1998.
[11] Andrew G. Barto, et al. Convergence of Indirect Adaptive Asynchronous Value Iteration Algorithms, 1993, NIPS.
[12] Richard S. Sutton, et al. Reinforcement Learning, 1992, Handbook of Machine Learning.