Design and Evaluation of an Instruction Cache for Reducing the Cost of Branches

Abstract: This paper presents the design and evaluation of an instruction cache for a pipelined processor. The performance of this cache is crucial to the efficiency of the mechanism the processor uses to reduce the cost of branches, which is based on prefetching techniques. The memory system is organized around a Branch Target Instruction Memory, that is, an instruction cache whose mapping unit is the set of instructions between two consecutive taken branches. The aim of this work is to find the optimum cache configuration for a given available area and external memory latency. The parameters investigated are the prefetch strategy, the number of cache lines, and the size of each line.
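
To make the mapping unit concrete, the following minimal sketch groups a dynamic instruction trace into runs that each end at a taken branch. It is only an illustration under assumptions: the trace format (pairs of address and a taken-branch flag) and the function name `split_into_branch_blocks` are hypothetical and not taken from the paper, and the sketch says nothing about how the actual hardware stores or prefetches these runs.

```python
def split_into_branch_blocks(trace):
    """Group a dynamic instruction trace into runs ending at a taken branch.

    `trace` is a list of (address, is_taken_branch) pairs. This format is an
    assumption made for illustration, not the paper's representation.
    """
    blocks, current = [], []
    for address, is_taken_branch in trace:
        current.append(address)
        if is_taken_branch:
            # A taken branch closes the current run; the next instruction
            # (the branch target) starts a new one.
            blocks.append(current)
            current = []
    if current:
        # Trailing run that is not closed by a taken branch.
        blocks.append(current)
    return blocks


# Hypothetical trace: two runs closed by taken branches, then a partial run.
trace = [
    (0x100, False), (0x104, False), (0x108, True),
    (0x200, False), (0x204, True),
    (0x100, False),
]
blocks = split_into_branch_blocks(trace)
```

Under this reading, each run would be one candidate unit for the cache, which suggests why the number of lines and the size of each line are the parameters that trade off against the available area.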
