A survey on software checkpointing and mobility techniques in distributed systems
[1] Lorenzo Alvisi,et al. Message logging: pessimistic, optimistic, and causal , 1995, Proceedings of 15th International Conference on Distributed Computing Systems.
[2] Achour Mostéfaoui,et al. A communication-induced checkpointing protocol that ensures rollback-dependency trackability , 1997, Proceedings of IEEE 27th International Symposium on Fault Tolerant Computing.
[3] Yi-Min Wang,et al. Consistent Global Checkpoints that Contain a Given Set of Local Checkpoints , 1997, IEEE Trans. Computers.
[4] Axel W. Krings,et al. A Checkpoint/Recovery Model for Heterogeneous Dataflow Computations Using Work-Stealing , 2005, Euro-Par.
[5] Thomas Hérault,et al. Computing on large-scale distributed systems: XtremWeb architecture, programming models, security, tests and convergence with grid , 2005, Future Gener. Comput. Syst..
[6] Danny B. Lange,et al. Programming and Deploying Java Mobile Agents with Aglets , 1998 .
[7] Achour Mostéfaoui,et al. Virtual Precedence in Asynchronous Systems: Concept and Applications , 1997, WDAG.
[8] Henri E. Bal,et al. Transparent Fault Tolerance for Grid Applications , 2005, EGC.
[9] Georg Stellner,et al. CoCheck: checkpointing and process migration for MPI , 1996, Proceedings of International Conference on Parallel Processing.
[10] Boleslaw K. Szymanski,et al. The Internet Operating System: Middleware for Adaptive Distributed Computing , 2006, Int. J. High Perform. Comput. Appl..
[11] Roberto Baldoni,et al. An Index-Based Checkpointing Algorithm for Autonomous Distributed Systems , 1999, IEEE Trans. Parallel Distributed Syst..
[12] Liang Chen,et al. Supporting fault-tolerance in streaming grid applications , 2007, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[13] C. Reich,et al. Engineering an Autonomic Container for WSRF-Based Web Services , 2007, 15th International Conference on Advanced Computing and Communications (ADCOM 2007).
[14] David Sinreich,et al. An architectural blueprint for autonomic computing , 2006 .
[15] Thomas Hérault,et al. MPICH-V Project: A Multiprotocol Automatic Fault-Tolerant MPI , 2006, Int. J. High Perform. Comput. Appl..
[16] Leslie Lamport,et al. Distributed snapshots: determining global states of distributed systems , 1985, TOCS.
[17] Nitin H. Vaidya,et al. Impact of Checkpoint Latency on Overhead Ratio of a Checkpointing Scheme , 1997, IEEE Trans. Computers.
[18] D. Nurmi. Model-Based Checkpoint Scheduling for Volatile Resource Environments , 2004 .
[19] Jason Maassen,et al. Fault-Tolerant Scheduling of Fine-Grained Tasks in Grid Environments , 2006, Int. J. High Perform. Comput. Appl..
[20] Andrzej M. Goscinski,et al. Self Healing and Self Configuration in a WSRF Grid Environment , 2005, ICA3PP.
[21] Franck Cappello,et al. Coordinated checkpoint versus message log for fault tolerant MPI , 2004, 2003 Proceedings IEEE International Conference on Cluster Computing.
[22] Fabio Kon,et al. Checkpointing BSP parallel applications on the InteGrade Grid middleware , 2006, Concurr. Comput. Pract. Exp..
[23] Mark A. Franklin,et al. Checkpointing in Distributed Computing Systems , 1996, J. Parallel Distributed Comput..
[24] James S. Plank,et al. Processor Allocation and Checkpoint Interval Selection in Cluster Computing Systems , 2001, J. Parallel Distributed Comput..
[25] Robert S. Gray,et al. Agent Tcl: A Flexible and Secure Mobile-agent System , 1996 .
[26] Boleslaw K. Szymanski,et al. Towards a middleware framework for dynamically reconfigurable scientific computing , 2004, High Performance Computing Workshop.
[27] W. Kent Fuchs,et al. CATCH-compiler-assisted techniques for checkpointing , 1990, [1990] Digest of Papers. Fault-Tolerant Computing: 20th International Symposium.
[28] Marco Danelutto,et al. ASSIST As a Research Framework for High-Performance Grid Programming Environments , 2006, Grid Computing: Software Environments and Tools.
[29] Erol Gelenbe,et al. On the Optimum Checkpoint Interval , 1979, JACM.
[30] W. Kent Fuchs,et al. Consistent Global Checkpoints Based on Direct Dependency Tracking , 1994, Inf. Process. Lett..
[31] Francine Berman,et al. Adaptive Computing on the Grid Using AppLeS , 2003, IEEE Trans. Parallel Distributed Syst..
[32] Achour Mostéfaoui,et al. Communication-Induced Determination of Consistent Snapshots , 1999, IEEE Trans. Parallel Distributed Syst..
[33] John W. Young,et al. A first order approximation to the optimum checkpoint interval , 1974, CACM.
[34] József Kovács,et al. Transparent parallel checkpointing and migration in clusters and ClusterGrids , 2009, Int. J. Comput. Sci. Eng..
[35] Mohamed Jmaiel,et al. A serialization based approach for strong mobility of shared object , 2007, PPPJ.
[36] Brian Randell,et al. System structure for software fault tolerance , 1975, IEEE Transactions on Software Engineering.
[37] Jack Dongarra,et al. Self adaptivity in Grid computing: Research Articles , 2005 .
[38] Kai Li,et al. Libckpt: Transparent Checkpointing under UNIX , 1995, USENIX.
[39] Roberto Baldoni,et al. Direct dependency-based determination of consistent global checkpoints , 2001, Comput. Syst. Sci. Eng..
[40] John Shalf,et al. The Cactus Worm: Experiments with Dynamic Resource Discovery and Allocation in a Grid Environment , 2001, Int. J. High Perform. Comput. Appl..
[41] W. Kent Fuchs,et al. Lazy checkpoint coordination for bounding rollback propagation , 1992, Proceedings of 1993 IEEE 12th Symposium on Reliable Distributed Systems.
[42] Paul Avery,et al. SPHINX: a fault-tolerant system for scheduling in dynamic grid environments , 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[43] Yi-Min Wang. Consistent Global Checkpoints that Contain a Given Set of Local Checkpoints , 1997 .
[44] Sathish S. Vadhiyar,et al. Self adaptivity in Grid computing , 2005, Concurr. Pract. Exp..
[45] Willy Zwaenepoel,et al. Manetho: Transparent Rollback-Recovery with Low Overhead, Limited Rollback, and Fast Output Commit , 1992, IEEE Trans. Computers.
[46] William H. Sanders,et al. Performance analysis of two time-based coordinated checkpointing protocols , 1997, Proceedings Pacific Rim International Symposium on Fault-Tolerant Systems.
[47] Gian Pietro Picco,et al. Understanding code mobility , 1998, Proceedings of the 2000 International Conference on Software Engineering. ICSE 2000 the New Millennium.
[48] Lorenzo Alvisi,et al. An analysis of communication induced checkpointing , 1999, Digest of Papers. Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing (Cat. No.99CB36352).
[49] Achour Mostéfaoui,et al. Preventing useless checkpoints in distributed computations , 1997, Proceedings of SRDS'97: 16th IEEE Symposium on Reliable Distributed Systems.
[50] Fabio Kon,et al. Checkpointing BSP parallel applications on the InteGrade Grid middleware: Research Articles , 2006 .
[51] Robert E. Strom,et al. Optimistic recovery in distributed systems , 1985, TOCS.