Differential Multithreading: Recapturing Pipeline Stall Cycles and Enhancing Throughput in Small-scale Embedded Microprocessors

This paper presents Differential Multithreading (dMT) as an inexpensive way to achieve high throughput from a single-issue architecture. dMT switches among multiple instruction streams in response to pipeline stall conditions but saves in-flight instructions, thus squashing pipeline bubbles and ensuring maximal utilization of a single pipeline. dMT uses auxiliary pipeline registers to save the state of in-flight but stalled instructions. This squashes bubbles that would otherwise arise from data hazards, branch delays, and cache misses. This paper describes the pipeline organization necessary to support dMT, explains the advantage of shared-pipeline multithreading, and presents preliminary results which suggest that dMT can substantially increase processor utilization.
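
The switch-on-stall idea can be illustrated with a toy cycle-count model. This is only a sketch under stated assumptions, not the paper's simulator or pipeline design: each instruction is represented by the number of stall cycles it imposes on its own thread, a thread switch is assumed to cost zero cycles (standing in for the auxiliary pipeline registers that hold stalled in-flight state), and the function names `baseline_cycles` and `dmt_cycles` are hypothetical.

```python
def baseline_cycles(threads):
    """Cycles for a single-issue pipeline that runs the threads back to back.

    Each instruction costs one issue cycle plus its own stall cycles,
    which appear as pipeline bubbles.
    """
    return sum(1 + stall for thread in threads for stall in thread)


def dmt_cycles(threads):
    """Cycles when the pipeline switches to another ready thread on a stall.

    Assumption: switching is free, and a stalled thread may issue again
    once its stall cycles have elapsed.
    """
    n = len(threads)
    pc = [0] * n          # index of the next instruction in each thread
    ready_at = [0] * n    # first cycle in which each thread may issue again
    remaining = sum(len(thread) for thread in threads)
    cycle = 0
    current = 0
    while remaining:
        runnable = [i for i in range(n)
                    if pc[i] < len(threads[i]) and ready_at[i] <= cycle]
        if runnable:
            # Stay on the current thread when possible, otherwise switch.
            chosen = current if current in runnable else runnable[0]
            stall = threads[chosen][pc[chosen]]
            pc[chosen] += 1
            ready_at[chosen] = cycle + 1 + stall
            remaining -= 1
            current = chosen
        cycle += 1        # an idle cycle if no thread was runnable
    # Include the trailing stall of the last instruction of each thread.
    return max([cycle] + ready_at)


# Two hypothetical threads; every second instruction stalls for 2 cycles.
threads = [[0, 2, 0, 2], [0, 2, 0, 2]]
instructions = sum(len(thread) for thread in threads)
print(instructions / baseline_cycles(threads))  # utilization without switching
print(instructions / dmt_cycles(threads))       # utilization with switching
```

On this made-up workload the model takes 16 cycles without switching and 10 with it, because one thread's stall cycles are filled by instructions from the other. The figures describe the toy model only and say nothing about the paper's measured results.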
