Process/thread migration and checkpointing in heterogeneous distributed systems

Process/thread migration and checkpointing are indispensable for resource sharing, cycle stealing, and other modes of interaction. To provide a flexible, transparent, and portable solution in heterogeneous environments, we have developed MigThread, a multi-grained migration/checkpointing package. It can migrate or checkpoint multiple threads to different machines or file systems simultaneously, and it can also perform coarse-grained migration/checkpointing of a single process. For scalability and portability, computation states are extracted from their original locations and abstracted to the language level. With user-level stack/heap management, MigThread does not depend on any particular thread library or operating system. For heterogeneity, a novel data conversion scheme is proposed that analyzes data types automatically and converts data only on the receiver side. For safety, MigThread detects and overcomes "unsafe" factors so that virtually all C programs qualify for migration/checkpointing. Performance measurements are given to illustrate its effectiveness.
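The idea of converting data only on the receiver side can be illustrated with a minimal C sketch. This is not MigThread's actual scheme or API: the packet layout, the one-byte byte-order tag, and the names `packet_t`, `pack_native`, and `unpack_to_native` are all assumptions made for illustration, and the sketch handles only 32-bit integers and byte order. The sender ships data in its native layout with a tag describing that layout; the receiver converts only when the tag differs from its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire format (an assumption, not MigThread's):
 * a one-byte byte-order tag followed by raw 32-bit words. */
enum { ORDER_LITTLE = 0, ORDER_BIG = 1 };
enum { PACKET_WORDS = 4 };

typedef struct {
    uint8_t  order;                 /* byte order of the sending machine */
    uint32_t words[PACKET_WORDS];   /* payload in the sender's native layout */
} packet_t;

/* Detect the byte order of the machine running this code. */
static uint8_t native_order(void) {
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1 ? ORDER_LITTLE : ORDER_BIG;
}

/* Reverse the four bytes of a 32-bit word. */
static uint32_t swap32(uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Sender side: copy the data unchanged and record the native byte order.
 * No conversion happens here. */
static void pack_native(packet_t *p, const uint32_t *src, size_t n) {
    assert(n <= PACKET_WORDS);
    p->order = native_order();
    memcpy(p->words, src, n * sizeof(uint32_t));
}

/* Receiver side: convert only if the sender's byte order differs. */
static void unpack_to_native(const packet_t *p, uint32_t *dst, size_t n) {
    assert(n <= PACKET_WORDS);
    const int must_swap = (p->order != native_order());
    for (size_t i = 0; i < n; i++) {
        dst[i] = must_swap ? swap32(p->words[i]) : p->words[i];
    }
}
```

Between two machines of the same architecture, this approach performs no conversion at all, whereas a canonical intermediate format would require a conversion on each side.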
