Assessing the Benefits of Fine-Grain Parallelism in Dataflow Programs

A method for assessing the benefits of fine-grain parallelism in "real" programs is presented. The method is based on parallelism profiles and speedup curves derived by executing dataflow graphs on an interpreter under progressively more realistic assumptions about processor resources and communication costs. Even using traditional algorithms, the programs exhibit ample parallelism when parallelism is exposed at all levels. The bias introduced by the language Id and the compiler is examined. A method of estimating speedup through analysis of the ideal parallelism profile is developed, avoiding repeated execution of programs. It is shown that fine-grain parallelism can be used to mask large, unpredictable memory latency and synchronization waits in architectures employing dataflow instruction execution mechanisms. Finally, the effects of grouping portions of dataflow programs, and requiring that the operators in a group execute on a single processor, are explored.
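
To make the notions of a parallelism profile and of speedup estimated from that profile concrete, the following is a minimal sketch and not the paper's interpreter or its estimation procedure. It assumes unit-time operators, zero communication cost, and a dataflow graph given as a plain adjacency dictionary. The names `parallelism_profile` and `estimated_speedup` are hypothetical, and the estimate (each step of ideal width w taking ceil(w / P) steps on P processors) is a common simplification chosen here for illustration.

```python
from collections import deque
from math import ceil


def parallelism_profile(graph):
    """Return the ideal parallelism profile of a dataflow graph.

    graph maps every node to a list of its successors (every node must
    appear as a key). Assumption: each operator takes one time step and
    fires as soon as all of its inputs are available, with unbounded
    processors. profile[t] is the number of operators firing at step t.
    """
    indegree = {node: 0 for node in graph}
    for successors in graph.values():
        for s in successors:
            indegree[s] += 1

    # Operators with no inputs fire at step 0.
    level = {node: 0 for node, d in indegree.items() if d == 0}
    ready = deque(level)
    while ready:
        node = ready.popleft()
        for s in graph[node]:
            # A successor fires one step after its latest-arriving input.
            level[s] = max(level.get(s, 0), level[node] + 1)
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)

    if not level:
        return []
    profile = [0] * (max(level.values()) + 1)
    for t in level.values():
        profile[t] += 1
    return profile


def estimated_speedup(profile, processors):
    """Estimate speedup on a fixed number of processors from the profile.

    Assumption (illustrative only): a step of ideal width w is stretched
    to ceil(w / processors) steps, and nothing else changes.
    """
    total_ops = sum(profile)  # time on a single processor
    steps = sum(ceil(width / processors) for width in profile)
    return total_ops / steps


# A small hypothetical graph: a and b feed c, c feeds d and e, both feed f.
example = {"a": ["c"], "b": ["c"], "c": ["d", "e"],
           "d": ["f"], "e": ["f"], "f": []}
profile = parallelism_profile(example)
print(profile)                        # [2, 1, 2, 1]
print(estimated_speedup(profile, 2))  # 1.5
```

Because the estimate is computed from the profile alone, speedup can be evaluated for many processor counts without re-executing the program, which is the property the abstract highlights.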
