Providing Non-stop Service for Message-Passing Based Parallel Applications with RADIC
[1] L. Alvisi,et al. A Survey of Rollback-Recovery Protocols , 2002 .
[2] Bianca Schroeder,et al. Understanding failures in petascale computers , 2007 .
[3] Anthony Skjellum,et al. Using MPI - portable parallel programming with the message-passing interface , 1994 .
[4] Emilio Luque,et al. An Intelligent Management of Fault Tolerance in Cluster Using RADICMPI , 2006, PVM/MPI.
[5] Thomas Hérault,et al. MPICH-V Project: A Multiprotocol Automatic Fault-Tolerant MPI , 2006, Int. J. High Perform. Comput. Appl.
[6] Takashi Nanya,et al. Evaluation of Checkpointing Mechanism on SCore Cluster System , 2003 .
[7] William Gropp,et al. Reproducible Measurements of MPI Performance Characteristics , 1999, PVM/MPI.
[8] Jack Dongarra,et al. Recent Advances in Parallel Virtual Machine and Message Passing Interface, 15th European PVM/MPI Users' Group Meeting, Dublin, Ireland, September 7-10, 2008. Proceedings , 2008, PVM/MPI.
[9] Emilio Luque,et al. Increasing the cluster availability using RADIC , 2006, 2006 IEEE International Conference on Cluster Computing.
[10] Pankaj Jalote,et al. Fault tolerance in distributed systems , 1994 .
[11] Zhiling Lan,et al. Exploit failure prediction for adaptive fault-tolerance in cluster computing , 2006, Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06).