Paper information - Implementing an efficient vector instruction set in a chip multi-processor using micro-threaded pipelines

This paper examines a combination of two techniques. The first, a vector instruction set, has a long history dating back to pipelined vector supercomputers such as the Cray 1 and its successors. The second, multi-threading, is also well understood. The novel approach proposed here combines both vertical and horizontal micro-threading with vector instruction descriptors. The paper shows that a family of threads can represent a vector instruction, with dependencies between the instances of that family (the iterations). This technique gives a very low overhead in implementing an n-way loop and is able to tolerate high memory latency. Using micro-threading to handle dependencies between threads makes it possible to trade off instruction-level parallelism against loop parallelism. The paper describes the means by which instruction classes may be instanced as independent parallel micro-threads and illustrates the speed-up that may be obtained compared to using a conventional loop.

Chris R. Jesshope
