Analysis of multithreaded architectures for parallel computing

Multithreading has been proposed as an architectural strategy for tolerating latency in multiprocessors and, through limited empirical studies, shown to offer promise. This paper develops an analytical model of multithreaded processor behavior based on a small set of architectural and program parameters. The model gives rise to a large Markov chain, which is solved to obtain a formula for processor efficiency in terms of the number of threads per processor, the remote reference rate, the latency, and the cost of switching between threads. A multithreaded processor is shown to exhibit three operating regimes: linear (efficiency is proportional to the number of threads), transition, and saturation (efficiency depends only on the remote reference rate and switch cost). Formulas for regime boundaries are derived. The model is embellished to reflect cache degradation due to multithreading, using an analytical model of cache behavior, demonstrating that returns diminish as the number of threads becomes large. Predictions from the embellished model correlate well with published empirical measurements. Prescriptive use of the model under various scenarios indicates that multithreading is effective, but the number of useful threads per processor is fairly small.
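The regime behavior described in the abstract can be illustrated with a simplified deterministic approximation. The sketch below is not the Markov-chain solution developed in the paper; it assumes constant run lengths and latencies, and the names `run_length` (cycles between remote references), `latency` (cycles per remote reference), and `switch_cost` (cycles per thread switch) are illustrative assumptions. Under these assumptions only the linear and saturation regimes appear; the transition regime arises when run lengths vary randomly.

```python
def saturation_point(run_length, latency, switch_cost):
    """Thread count at which the latency is fully hidden (deterministic assumption)."""
    return 1.0 + latency / (run_length + switch_cost)


def efficiency(n_threads, run_length, latency, switch_cost):
    """Approximate processor efficiency for n_threads threads per processor.

    Assumes every thread runs exactly run_length cycles between remote
    references and every reference takes exactly latency cycles.
    """
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    if n_threads >= saturation_point(run_length, latency, switch_cost):
        # Saturation: efficiency depends only on run length and switch cost.
        return run_length / (run_length + switch_cost)
    # Linear regime: efficiency grows in proportion to the number of threads.
    return n_threads * run_length / (run_length + switch_cost + latency)


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    for n in range(1, 8):
        print(n, round(efficiency(n, run_length=10, latency=40, switch_cost=2), 3))
```

With these hypothetical parameters the saturation point falls between four and five threads, which is in line with the abstract's observation that the number of useful threads per processor is fairly small.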
