Design of cache memories for dataflow architecture

Abstract: The recent advance in dataflow processing, which combines the dataflow paradigm with the control-flow paradigm, has raised many challenging new issues. This hybrid organization has made it possible to study and adapt familiar control-flow concepts, such as cache memories, within the framework of the dataflow architecture. Cache memory has proven effective in the von Neumann architecture because of the spatial and temporal locality that governs conventional program execution. The dataflow paradigm does not inherently support locality, since the execution sequence is enforced only by the availability of operands. However, dataflow programs can be reordered based on various criteria to enhance the locality of instruction references. This can be achieved by: (i) careful partitioning of a dataflow program into vertical layers of data-dependent instructions; and (ii) proper distribution and allocation of the recurrent portions of the dataflow program. Enhancing the locality of data references in the dataflow architecture is a more challenging problem. This paper studies the design of instruction, data (operand), and I-Structure cache memories using the Explicit Token Store (ETS) model of a dataflow system. Performance results obtained using various benchmark programs are presented and analyzed.
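As an illustration of step (i), the following minimal sketch splits a dataflow graph into chains of data-dependent instructions, so that each chain could be stored contiguously and referenced sequentially. It is not the partitioning algorithm of the paper: the greedy strategy, the function name `vertical_layers`, and the edge-list graph representation are all assumptions made for this example.

```python
from collections import defaultdict


def vertical_layers(nodes, edges):
    """Greedily split a dataflow DAG into chains of data-dependent instructions.

    nodes: list of instruction ids (hypothetical representation).
    edges: list of (producer, consumer) pairs.
    Returns a list of chains; consecutive instructions in a chain are
    connected by a data dependence. Assumes the graph is acyclic.
    """
    succs = defaultdict(list)
    indeg = {n: 0 for n in nodes}
    for src, dst in edges:
        succs[src].append(dst)
        indeg[dst] += 1

    # Topological order (Kahn's algorithm) so chains start at the
    # earliest instruction that is not yet assigned.
    ready = [n for n in nodes if indeg[n] == 0]
    order = []
    while ready:
        n = ready.pop(0)
        order.append(n)
        for s in succs[n]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)

    assigned = set()
    layers = []
    for n in order:
        if n in assigned:
            continue
        chain = [n]
        assigned.add(n)
        cur = n
        # Extend the chain along the first unassigned consumer, if any.
        while True:
            nxt = next((s for s in succs[cur] if s not in assigned), None)
            if nxt is None:
                break
            chain.append(nxt)
            assigned.add(nxt)
            cur = nxt
        layers.append(chain)
    return layers


if __name__ == "__main__":
    # Toy graph: a and b feed c; c feeds d and e.
    demo_nodes = ["a", "b", "c", "d", "e"]
    demo_edges = [("a", "c"), ("b", "c"), ("c", "d"), ("c", "e")]
    for layer in vertical_layers(demo_nodes, demo_edges):
        print(layer)
```

Placing the instructions of one chain at adjacent addresses is one way such a partition could raise the spatial locality of instruction references; how chains are mapped to cache blocks is left open here.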
