Fault-tolerance for macro dataflow parallel computations on grid
[1] Erik Maehle, et al. Fault-Tolerant Dynamic Task Scheduling Based on Dataflow Graphs, 1998.
[2] Michael A. Bender, et al. Online Scheduling of Parallel Programs on Heterogeneous Systems with Applications to Cilk, 2002, SPAA '00.
[3] Francine Berman, et al. Overview of the Book: Grid Computing – Making the Global Infrastructure a Reality, 2003.
[4] Miron Livny, et al. Checkpoint and Migration of UNIX Processes in the Condor Distributed Processing System, 1997.
[5] Robert D. Blumofe, et al. Scheduling multithreaded computations by work stealing, 1994, Proceedings 35th Annual Symposium on Foundations of Computer Science.
[6] C. Siva Ram Murthy, et al. A Fault-Tolerant Dynamic Scheduling Algorithm for Multiprocessor Real-Time Systems and Its Analysis, 1998, IEEE Trans. Parallel Distributed Syst.
[7] Bradley C. Kuszmaul, et al. Cilk: an efficient multithreaded runtime system, 1995, PPOPP '95.
[8] Gerson G. H. Cavalheiro, et al. Athapascan-1: On-line building data flow graph in a parallel language, 1998, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192).
[9] Harrick M. Vin, et al. Egida: an extensible toolkit for low-overhead fault-tolerance, 1999, Digest of Papers. Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing (Cat. No.99CB36352).
[10] Lorenzo Alvisi, et al. Reasons for a pessimistic or optimistic message logging protocol in MPI uncoordinated failure, recovery, 2009, 2009 IEEE International Conference on Cluster Computing and Workshops.
[11] Peter Steenkiste, et al. Fail-Safe PVM: A Portable Package for Distributed Programming with Transparent Recovery, 1993.
[12] Partha Dasgupta, et al. Distributed Cactus Stacks: Runtime Stack-Sharing Support for Distributed Parallel Programs, 1998.
[13] Thomas Hérault, et al. MPICH-V: Toward a Scalable Fault Tolerant MPI for Volatile Nodes, 2002, ACM/IEEE SC 2002 Conference (SC'02).