Instruction fetch unit for parallel execution of branch instructions

A mechanism for reducing the cost of branches in pipelined processors is presented. It is implemented with a non-conventional cache, the branch target cache, and an early branch detection circuit. The instruction fetch unit (IFU) executes branches in parallel with the other instructions, so the execution time cost of many branches can be effectively reduced to zero. The mechanism is evaluated with an analytical model to obtain the IFU design parameters, and simulation results show the effectiveness of the technique.
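The idea of a branch costing zero execution time on a branch target cache hit can be illustrated with a toy cycle count. This is only a sketch under stated assumptions, not the design or the parameters evaluated in the paper: the fully associative LRU organization, the `miss_penalty` of 2 cycles, the trace format, and the names `BranchTargetCache` and `count_cycles` are all hypothetical and chosen for illustration.

```python
from collections import OrderedDict


class BranchTargetCache:
    """Toy branch target cache: branch address -> target address.

    Fully associative with LRU replacement (an assumption for this sketch).
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, pc):
        """Return the cached target for the branch at pc, or None on a miss."""
        if pc in self.entries:
            self.entries.move_to_end(pc)  # mark as most recently used
            return self.entries[pc]
        return None

    def insert(self, pc, target):
        """Record a branch target, evicting the least recently used entry."""
        self.entries[pc] = target
        self.entries.move_to_end(pc)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)


def count_cycles(trace, btc, miss_penalty=2):
    """Count execution cycles for a trace of (pc, target) records.

    target is None for an ordinary instruction (1 cycle, assumed).
    A branch whose target is found in the cache is assumed to be handled
    entirely by the fetch unit and costs 0 cycles; otherwise it costs
    1 + miss_penalty cycles (hypothetical values) and is inserted.
    """
    cycles = 0
    for pc, target in trace:
        if target is None:
            cycles += 1
        elif btc.lookup(pc) == target:
            pass  # hit: branch executed in parallel by the fetch unit
        else:
            cycles += 1 + miss_penalty
            btc.insert(pc, target)
    return cycles


# Hypothetical loop: three ordinary instructions, then a branch back to 0,
# executed four times.
trace = [(0, None), (1, None), (2, None), (3, 0)] * 4
with_btc = count_cycles(trace, BranchTargetCache(capacity=4))
without_btc = count_cycles(trace, BranchTargetCache(capacity=0))
print(with_btc, without_btc)  # 15 24
```

In this toy model only the first execution of the branch pays the miss cost; the three later executions hit in the cache and add no cycles, which is the effect the abstract describes. A cache of capacity 0 stands in for a fetch unit without the mechanism.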
