Exploiting virtual registers to reduce pressure on real registers

It is well known that a large fraction of variables are short-lived. This paper proposes a novel approach that exploits this fact to reduce register pressure in pipelined processors with a data-forwarding network. The idea is that the compiler can allocate virtual registers (i.e., placeholders that identify dependences among instructions) to short-lived variables, whose values do not need to be stored in physical storage locations. As a result, real registers (i.e., registers that physically exist) can be reserved for long-lived variables, which mitigates register pressure and reduces register spills, leading to improved performance. In this paper, we develop the architectural and compiler support needed to exploit virtual registers on statically scheduled processors. Our experimental results show that virtual registers are very effective at reducing register spills and, in many cases, achieve performance close to that of a processor with twice the number of real registers. Our results also indicate that, for some applications, using 24 virtual registers in addition to 8 real registers can attain even higher performance than using 16 real registers without any virtual registers.
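
The allocation idea described above can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not state the exact criteria, so the sketch assumes a value may take a virtual register only if every use occurs within a fixed number of cycles after its definition, while the value is still available on the forwarding network. The names `LiveRange`, `classify`, and `forward_window`, and the window size of 2 cycles, are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LiveRange:
    name: str
    def_cycle: int          # cycle in which the value is produced
    use_cycles: List[int]   # cycles in which the value is consumed


def classify(ranges: List[LiveRange],
             forward_window: int = 2) -> Tuple[List[str], List[str]]:
    """Split live ranges into virtual-register and real-register candidates.

    Assumption (not from the paper): a value can be forwarded to a consumer
    only if the consumer issues at most `forward_window` cycles after the
    producer in the static schedule.
    """
    virtual, real = [], []
    for r in ranges:
        # Every use must fall inside the assumed forwarding window;
        # values with no uses are conservatively kept in real registers.
        short_lived = bool(r.use_cycles) and all(
            0 < use - r.def_cycle <= forward_window for use in r.use_cycles
        )
        (virtual if short_lived else real).append(r.name)
    return virtual, real


# Hypothetical schedule: two temporaries and two long-lived variables.
ranges = [
    LiveRange("t1", def_cycle=0, use_cycles=[1]),
    LiveRange("t2", def_cycle=1, use_cycles=[2, 3]),
    LiveRange("sum", def_cycle=0, use_cycles=[4, 9]),
    LiveRange("i", def_cycle=0, use_cycles=[1, 8]),
]
virtual, real = classify(ranges)
print("virtual:", virtual)
print("real:", real)
```

In this sketch only `sum` and `i` compete for real registers, which is the effect the abstract describes: short-lived temporaries stop consuming physical storage, so fewer long-lived values need to be spilled.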
