Monsoon: an explicit token-store architecture

Dataflow architectures tolerate long, unpredictable communication delays and support the generation and coordination of parallel activities directly in hardware, rather than assuming that program mapping will make these issues disappear. However, the mechanisms proposed so far are complex and introduce mapping complications of their own. This paper presents a greatly simplified approach to dataflow execution, called the explicit token store (ETS) architecture, and its current realization in Monsoon. The essence of dynamic dataflow execution is captured by a simple transition on state bits associated with storage local to a processor. Low-level storage management is performed by the compiler, which assigns nodes to slots in an activation frame, rather than dynamically in hardware. The processor is simple, highly pipelined, and quite general; it may be viewed as a generalization of a fairly primitive von Neumann architecture. Although the addressing capability is restrictive, exactly one instruction is executed for each action on the dataflow graph. Thus, the machine-oriented ETS model provides new understanding of the merits and the real cost of direct execution of dataflow graphs.
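The state-bit transition and the compiler-assigned frame slots described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not Monsoon's actual instruction set or pipeline: the names `Slot`, `Instr`, `Token`, `step`, and `run` are hypothetical, each two-input node is assumed to have a single presence bit on its frame slot, and the example graph computes (a + b) * (a - b) with slot numbers chosen by hand in place of a compiler.

```python
import operator
from collections import deque
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Slot:
    present: bool = False          # state bit: is an operand waiting here?
    value: Optional[int] = None


@dataclass
class Instr:
    op: Callable[[int, int], int]
    slot: int                      # frame offset, assumed fixed at compile time
    dests: List[Tuple[Optional[int], str]]  # (target instruction, port); None = graph output


@dataclass
class Token:
    frame: int                     # which activation frame
    ip: Optional[int]              # which instruction (None marks an output)
    port: str                      # "L" or "R" operand
    value: int


def step(program, frames, tok):
    """Process one token: either park it in its slot or fire the node."""
    instr = program[tok.ip]
    slot = frames[tok.frame][instr.slot]
    if not slot.present:
        # First operand to arrive: store it and set the presence bit.
        slot.present, slot.value = True, tok.value
        return []
    # Second operand: read the partner, clear the bit, execute one instruction.
    left, right = (slot.value, tok.value) if tok.port == "R" else (tok.value, slot.value)
    slot.present, slot.value = False, None
    result = instr.op(left, right)
    return [Token(tok.frame, ip, port, result) for ip, port in instr.dests]


def run(program, frames, tokens):
    """Drain the token queue and collect values sent to the graph output."""
    queue = deque(tokens)
    outputs = []
    while queue:
        tok = queue.popleft()
        if tok.ip is None:
            outputs.append(tok.value)
        else:
            queue.extend(step(program, frames, tok))
    return outputs


# Hand-assigned slots stand in for the compiler's frame allocation.
program = [
    Instr(operator.add, 0, [(2, "L")]),
    Instr(operator.sub, 1, [(2, "R")]),
    Instr(operator.mul, 2, [(None, "L")]),
]
frames = {0: [Slot() for _ in range(3)]}
a, b = 5, 3
tokens = [Token(0, 0, "L", a), Token(0, 1, "R", b),
          Token(0, 1, "L", a), Token(0, 0, "R", b)]
outputs = run(program, frames, tokens)
```

In this sketch operands may arrive in either order, each node fires exactly once when its second operand arrives, and every slot is empty again afterward, so the frame could be reused by a later activation.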
