Control of loop parallelism in multithreaded code

Because of the large amount of potential parallelism, resource management is a critical issue in multithreaded architectures. The challenge in code generation is to control the parallelism without reducing the machine's ability to exploit it. Controlled parallelism reduces idle time, communication, and the delay caused by synchronization. At the same time, it increases the potential for exploiting the locality of program data structures. In this paper we present and evaluate two methods, slicing and chunking, for controlling program parallelism. We present the compilation strategy and evaluate its effectiveness in terms of performance characteristics such as run time and matching store
