Processor Allocation in a Multi-ring Dataflow Machine

Abstract: The performance of a multiprocessor architecture is determined both by the way the program is partitioned into processes and by the way these processes are allocated to different processors. In the fine-grain dataflow model, where each process consists of a single instruction, decomposition of a program into processes is achieved automatically by compilation. This paper investigates the effectiveness of fine-grain decomposition in the context of the prototype dataflow machine now in operation at the University of Manchester. The current machine is a uniprocessor, known as the Single-Ring Dataflow Machine, comprising a single processing element that contains several units connected in a pipelined ring. A Multi-ring Dataflow Machine (MDM), containing several such processing elements connected via an interprocessor switching network, is currently under investigation. The paper describes a method of allocating dataflow instructions to processing elements in the MDM and examines the influence of this method on the selection of a switching network. Results obtained from simulation of the MDM are presented. They show that programs are executed efficiently when their parallelism is matched to the parallelism of the machine hardware.
