On the feasibility of fixed-length block structured architectures

Scaling contemporary superscalar microarchitectures to higher levels of parallelism in future technologies appears impractical because of their increasing complexity. In this paper, we show that a fixed-length block structured instruction set architecture (BSA) is capable of reducing hardware complexity and is therefore a feasible alternative to traditional architectures with large virtual window sizes in future technologies. This is achieved in two ways. First, statically grouping instructions from various basic blocks into larger fixed-length atomic units of work, called blocks, makes fetching easier. Second, a decentralized microarchitecture significantly reduces the processor core logic, which allows higher clock frequencies. The performance evaluation methodology used in this paper considers both IPC (the number of useful instructions retired per clock cycle) and the clock cycle period. In addition, a broad design space is explored by quantifying the influence of various microarchitectural parameters on overall performance.
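To illustrate why the evaluation combines IPC with the clock cycle period, the following is a minimal sketch that compares two designs by useful instructions retired per nanosecond (IPC divided by the cycle time). The function names and all numeric values are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def instructions_per_ns(ipc: float, cycle_time_ns: float) -> float:
    """Useful instructions retired per nanosecond: IPC divided by the clock period."""
    if cycle_time_ns <= 0:
        raise ValueError("cycle_time_ns must be positive")
    return ipc / cycle_time_ns


def speedup(ipc_a: float, cycle_a_ns: float, ipc_b: float, cycle_b_ns: float) -> float:
    """Throughput of design A relative to design B (values above 1 favor A)."""
    return instructions_per_ns(ipc_a, cycle_a_ns) / instructions_per_ns(ipc_b, cycle_b_ns)


# Hypothetical numbers, NOT taken from the paper:
# a centralized design with higher IPC but a longer clock period,
# and a decentralized design with lower IPC but a shorter clock period.
centralized = {"ipc": 4.0, "cycle_time_ns": 2.0}
decentralized = {"ipc": 3.6, "cycle_time_ns": 1.5}

relative = speedup(
    decentralized["ipc"], decentralized["cycle_time_ns"],
    centralized["ipc"], centralized["cycle_time_ns"],
)
print(f"decentralized vs. centralized throughput ratio: {relative:.2f}")
```

With these made-up inputs, the design with the lower IPC still comes out ahead by roughly 20% because its clock period is shorter, which is why comparing IPC alone can be misleading.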
