Effective Dispatching for Simultaneous Multi-Threading (SMT) Processors by Capping Per-Thread Resource Utilization

Simultaneous multithreading (SMT) improves resource utilization by sharing key datapath components among multiple independent threads. When critical resources are shared by multiple threads, using them effectively is the most important factor in fully exploiting the system's potential. Allowing any one thread to overwhelm these shared resources not only leads to unfair thread processing but may also severely degrade overall performance. Preventing idling threads from clogging the critical resources in the pipeline is therefore essential to sustaining system performance. In this paper, we show that simply capping the number of critical Issue Queue (IQ) entries each thread is allowed to occupy improves system performance by a significant margin. An even more pronounced advantage of the proposed technique over more advanced dispatching algorithms is that the performance gain is obtained with minimal additional hardware.
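The capping idea can be illustrated with a toy model. The sketch below is not the paper's simulator or hardware design; the class name `CappedIssueQueue`, the queue size of 8, and the cap of 3 are all hypothetical values chosen for illustration. It assumes a single shared IQ in which an instruction may be dispatched only if the queue has a free entry and the dispatching thread holds fewer entries than its cap.

```python
from collections import Counter


class CappedIssueQueue:
    """Toy model of a shared issue queue with a per-thread occupancy cap.

    Hypothetical illustration only: names and sizes are assumptions,
    not taken from the paper.
    """

    def __init__(self, size, per_thread_cap):
        self.size = size                      # total IQ entries shared by all threads
        self.per_thread_cap = per_thread_cap  # max entries any one thread may hold
        self.occupancy = Counter()            # thread id -> entries currently held

    def total(self):
        """Number of IQ entries currently occupied across all threads."""
        return sum(self.occupancy.values())

    def can_dispatch(self, tid):
        """A thread may dispatch only if the IQ has room and it is under its cap."""
        return self.total() < self.size and self.occupancy[tid] < self.per_thread_cap

    def dispatch(self, tid):
        """Try to place one instruction from thread `tid`; return True on success."""
        if not self.can_dispatch(tid):
            return False
        self.occupancy[tid] += 1
        return True

    def issue(self, tid):
        """Release one entry held by thread `tid` (its instruction issued)."""
        if self.occupancy[tid] == 0:
            raise ValueError(f"thread {tid} holds no IQ entries")
        self.occupancy[tid] -= 1


def run_scenario(per_thread_cap, size=8):
    """Thread 0 never issues (e.g. waiting on a long-latency miss) and tries to
    dispatch 8 instructions; thread 1 then tries to dispatch 3."""
    iq = CappedIssueQueue(size=size, per_thread_cap=per_thread_cap)
    accepted_t0 = sum(iq.dispatch(0) for _ in range(8))
    accepted_t1 = sum(iq.dispatch(1) for _ in range(3))
    return accepted_t0, accepted_t1


if __name__ == "__main__":
    print("cap=8 (effectively uncapped):", run_scenario(per_thread_cap=8))
    print("cap=3:", run_scenario(per_thread_cap=3))
```

In this toy scenario, the uncapped configuration lets the non-issuing thread fill every entry and leaves the second thread unable to dispatch, while the capped configuration reserves room for it. The only per-thread state the model needs is an occupancy counter and a comparison against the cap, which is in keeping with the abstract's claim of minimal additional hardware.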
