StackThreads/MP: integrating futures into calling standards

This paper describes an implementation scheme for fine-grain multithreading that requires no changes to current calling standards for sequential languages and only modest extensions to sequential compilers. Like previous similar systems, it performs an asynchronous call as if it were an ordinary procedure call, and detaches the callee from the caller when the callee suspends or when either of them migrates to another processor. Unlike previous systems, it detaches and connects arbitrary frames generated by off-the-shelf sequential compilers that obey calling standards. Consequently, it requires neither a frontend preprocessor nor a native code generator with a built-in notion of parallelism. In practice, the system works with the unmodified GNU C compiler (GCC). The extensions to sequential compilers that are desirable for guaranteeing the portability and correctness of the scheme are identified and argued to be modest. Experiments indicate that sequential performance is not sacrificed for practical applications, and that both sequential and parallel performance are comparable to Cilk[8], whose current implementation requires a fairly sophisticated preprocessor for C. These results show that efficient asynchronous calls (a.k.a. future calls) can be integrated into current calling standards with very little impact on either sequential performance or compiler engineering.

[1]  Marc Feeley, et al. A Message Passing Implementation of Lazy Task Creation, 1992, Parallel Symbolic Computing.

[2]  Akinori Yonezawa, et al. Fine-grain multithreading with minimal compiler support—a cost effective approach to implementing efficient multithreading languages, 1997, PLDI '97.

[3]  Akinori Yonezawa, et al. An Efficient Compilation Framework for Languages Based on a Concurrent Process Calculus, 1997, Euro-Par.

[4]  Bradley C. Kuszmaul, et al. Cilk: an efficient multithreaded runtime system, 1995, PPOPP '95.

[6]  Devang Shah, et al. Programming with threads, 1996.

[7]  Kazunori Ueda, et al. Design of the Kernel Language for the Parallel Inference Machine, 1990, Computer/law journal.

[8]  A. Yonezawa, et al. StackThreads/MP: Integrating Futures into Calling Standards, 1999.

[9]  Akinori Yonezawa, et al. Schematic: A Concurrent Object-Oriented Extension to Scheme, 1995, OBPDC.

[10]  Robert H. Halstead, et al. Lazy task creation: a technique for increasing the granularity of parallel programs, 1990, LISP and Functional Programming.

[11]  Marc Feeley. Polling efficiently on stock hardware, 1993, FPCA '93.

[12]  Robert H. Halstead, et al. MULTILISP: a language for concurrent symbolic computation, 1985, TOPL.

[13]  Richard M. Stallman, et al. Using and Porting GNU CC, 1998.

[14]  Anne Rogers, et al. Supporting dynamic data structures on distributed-memory machines, 1995, TOPL.

[15]  Matteo Frigo, et al. The implementation of the Cilk-5 multithreaded language, 1998, PLDI.

[16]  Rishiyur S. Nikhil, Arvind, et al. Id: a language with implicit parallelism, 1992.

[17]  Xingbin Zhang, et al. A Hybrid Execution Model for Fine-Grained Languages on Distributed Memory Multicomputers, 1995, Proceedings of the IEEE/ACM SC95 Conference.

[18]  Satoshi Matsuoka, et al. StackThreads: An Abstract Machine for Scheduling Fine-Grain Threads on Stock CPUs, 1994, Theory and Practice of Parallel Programming.

[19]  Seth Copen Goldstein, et al. Lazy Threads: Implementing a Fast Parallel Call, 1996, J. Parallel Distributed Comput.

[20]  Mario Tokoro, et al. Object-oriented concurrent programming, 1987.

[21]  Andrew A. Chien, et al. ICC++ - A C++ Dialect for High Performance Parallel Computing, 1996, ISOTAS.

[22]  Luca Cardelli, et al. Modula-3 Report (revised), 1992.

[23]  Andrew A. Chien, et al. ICC++—a C++ dialect for high performance parallel computing, 1996, SIAP.

[24]  Marc Feeley, et al. An efficient and general implementation of futures on large scale shared-memory multiprocessors, 1993.

[25]  Robert H. Halstead, et al. Lazy task creation: a technique for increasing the granularity of parallel programs, 1990, IEEE Trans. Parallel Distributed Syst.