Near Overhead-free Heterogeneous Thread-migration

Thread migration moves a single call stack to another machine to improve either load balancing or locality. Current approaches to checkpointing and thread migration either do not support heterogeneous machines or introduce large runtime overhead. In general, previous approaches add overhead by instrumenting each function in a program, so the instrumentation costs are incurred even when no thread migration is performed. In this respect our system is near overhead-free: almost no overhead is incurred if no migration is performed. Instead of instrumenting functions, our implementation generates meta-functions for each location in the code where a function is called. These meta-functions portably save and rebuild activation records to and from a machine-independent format. Each variable of an activation record is described in terms of its usages in a machine-independent 'usage descriptor string', which enables heterogeneous, near overhead-free thread migration with as few changes to the compiler as possible. Our resulting thread migration solution is, for example, able to move a thread between an x86 machine (few registers, 32 bits) and an Itanium machine (many registers, 64 bits). Furthermore, we (optionally) leave the decision on when and where to migrate to the application programmer instead of implementing a fixed 'fits-all' heuristic as in previous approaches.
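
To make the idea of per-call-site meta-functions concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not give the descriptor format, so the one-character-per-variable descriptor (`i` for integer, `f` for floating point), the fixed-width big-endian wire encoding, and the names `CALL_SITES`, `save_frame`, and `rebuild_frame` are all assumptions made for illustration.

```python
import struct

# Hypothetical wire encoding: every integer travels as a 64-bit big-endian
# value and every float as a 64-bit big-endian double, regardless of the
# word size or byte order of the sending or receiving machine.
WIRE_FORMATS = {"i": ">q", "f": ">d"}

# Hypothetical table: one descriptor string per call site. Nothing in the
# running program touches this table unless a migration actually happens.
CALL_SITES = {"main:call#1": "iif"}


def save_frame(site, values):
    """Serialize the live variables at a call site into portable bytes."""
    descriptor = CALL_SITES[site]
    if len(descriptor) != len(values):
        raise ValueError("descriptor and frame disagree on variable count")
    return b"".join(
        struct.pack(WIRE_FORMATS[kind], value)
        for kind, value in zip(descriptor, values)
    )


def rebuild_frame(site, payload):
    """Rebuild the variables of a call site from portable bytes."""
    values = []
    offset = 0
    for kind in CALL_SITES[site]:
        fmt = WIRE_FORMATS[kind]
        (value,) = struct.unpack_from(fmt, payload, offset)
        values.append(value)
        offset += struct.calcsize(fmt)
    return values


# Round trip: the sender saves a frame, the receiver rebuilds it.
payload = save_frame("main:call#1", [7, -3, 2.5])
restored = rebuild_frame("main:call#1", payload)
```

Because the save and rebuild routines are looked up by call site only at migration time, ordinary execution pays nothing for them in this sketch, which mirrors the near overhead-free property claimed above.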
