Paper Information - RingScalar: A Complexity-Effective Out-of-Order Superscalar Microarchitecture

RingScalar: A Complexity-Effective Out-of-Order Superscalar Microarchitecture

RingScalar is a complexity-effective microarchitecture for out-of-order superscalar processors that reduces the area, latency, and power of all major structures in the instruction flow. The design divides an N-way superscalar into N columns connected in a unidirectional ring, where each column contains a portion of the instruction window, a bank of the register file, and an ALU. Because most decoded instructions are waiting on just one operand, the design uses only a single tag per issue window entry and restricts instruction wakeup and value bypass to communication with the neighboring column. Detailed simulations of four-issue single-threaded machines running SPECint2000 show that RingScalar has IPC only 13% lower than an idealized superscalar, while providing large reductions in area, power, and circuit latency.
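The neighbor-only wakeup described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `Entry`, `Column`, `neighbor`, and `wake_neighbor` are hypothetical, and the model assumes only what the abstract states, namely that each window entry holds a single tag and that a column wakes entries in the next column of the unidirectional ring.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    name: str
    wait_tag: Optional[int]  # the single tag this entry waits on (None = ready)


@dataclass
class Column:
    # Each column holds its own slice of the instruction window (assumed layout).
    window: List[Entry] = field(default_factory=list)


def neighbor(col: int, n_columns: int) -> int:
    """Index of the only column that column `col` can wake (unidirectional ring)."""
    return (col + 1) % n_columns


def wake_neighbor(columns: List[Column], src_col: int, tag: int) -> List[str]:
    """Deliver `tag` produced in `src_col` to the next column only.

    Returns the names of the entries that became ready.
    """
    dst = columns[neighbor(src_col, len(columns))]
    woken = []
    for entry in dst.window:
        if entry.wait_tag == tag:
            entry.wait_tag = None  # operand now available
            woken.append(entry.name)
    return woken
```

In this sketch, an entry waiting on the same tag in any non-neighboring column is left untouched, which is the restriction that replaces a global tag broadcast.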

Krste Asanovic | Jessica H. Tseng

[1] Nader Bagherzadeh, et al. A scalable register file architecture for dynamically scheduled processors, 1996, Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques.

[2] T. N. Vijaykumar, et al. Reducing register ports for higher speed and lower energy, 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings.

[3] Mikko H. Lipasti, et al. Half-price architecture, 2003, ISCA '03.

[4] R. M. Tomasulo, et al. An efficient algorithm for exploiting multiple arithmetic units, 1995.

[5] Krste Asanovic, et al. A speculative control scheme for an energy-efficient banked register file, 2005, IEEE Transactions on Computers.

[6] T. Austin, et al. Cyclone: a broadcast-free dynamic instruction scheduler with selective replay, 2003, 30th Annual International Symposium on Computer Architecture, 2003. Proceedings.

[7] Krste Asanovic, et al. Banked multiported register files for high-frequency superscalar microprocessors, 2003, ISCA '03.

[8] Rajeev Balasubramonian, et al. Reducing the complexity of the register file in dynamic superscalar processors, 2001, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34.

[9] Jaume Abella, et al. Inherently workload-balanced clustered microarchitecture, 2005, 19th IEEE International Parallel and Distributed Processing Symposium.

[10] James E. Smith, et al. Complexity-Effective Superscalar Processors, 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.

[11] Dean M. Tullsen, et al. Fellowship - Simulation And Modeling Of A Simultaneous Multithreading Processor, 1996, Int. CMG Conference.

[12] Manoj Franklin, et al. PEWs: a decentralized dynamic scheduler for ILP processing, 1996, Proceedings of the 1996 ICPP Workshop on Challenges for Parallel Processing.

[13] Mateo Valero, et al. Multiple-banked register file architectures, 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).

[14] Todd M. Austin, et al. Efficient dynamic scheduling through tag elimination, 2002, ISCA.

[15] Steven K. Reinhardt, et al. A scalable instruction queue design using dependence chains, 2002, ISCA.

[16] Norman P. Jouppi, et al. The multicluster architecture: reducing cycle time through partitioning, 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[17] Richard E. Kessler, et al. The Alpha 21264 microprocessor architecture, 1998, Proceedings International Conference on Computer Design. VLSI in Computers and Processors (Cat. No.98CB36273).

[18] Quinn Jacobson, et al. Trace processors, 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[19] Manoj Franklin, et al. An empirical study of decentralized ILP execution models, 1998, ASPLOS VIII.

[20] Pradip Bose, et al. Tradeoffs in power-efficient issue queue design, 2002, ISLPED '02.

[21] Kenneth C. Yeager. The MIPS R10000 superscalar microprocessor, 1996, IEEE Micro.