Potential of Using a Reconfigurable System on a Superscalar Core for ILP Improvements

As technology scaling slows and energy efficiency becomes an important design constraint, superscalar processor designs appear to be reaching their performance limits under area and power constraints. As a result, new architectural paradigms must be developed. This work proposes a new architecture for x86 processors, based on a traditional superscalar design coupled to a reconfigurable array. The architecture exploits the fact that a few basic blocks account for most of the instructions executed on the processor, and maps these basic blocks into configurations for the reconfigurable array. Each configuration encodes the dependencies between the instructions, so that when a loop is executed multiple times, the fetch, decode and dependency checks for those instructions are bypassed, improving instruction throughput. Our study of the architecture's potential shows that performance gains of up to 2.5 times over a traditional superscalar can be achieved.
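The idea of resolving dependencies once and reusing the result can be illustrated with a minimal Python sketch. It is not the paper's mapping algorithm: the names `Instr`, `build_configuration` and `ConfigCache` are hypothetical, the array is modelled as a list of rows in which instructions in the same row are mutually independent, and only true (read-after-write) dependencies are tracked, on the assumption that values are forwarded by producer rather than by register name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction: one destination register, some source registers."""
    dest: str
    srcs: tuple


def build_configuration(block):
    """Place each instruction in the earliest row after all of its producers.

    Instructions sharing a row have no read-after-write dependency on each
    other, so the (assumed) array could execute them in parallel.
    """
    ready_row = {}  # register -> row in which its latest value is produced
    rows = []
    for instr in block:
        # Sources produced outside the block are treated as ready at row 0.
        row = max((ready_row[s] + 1 for s in instr.srcs if s in ready_row),
                  default=0)
        while len(rows) <= row:
            rows.append([])
        rows[row].append(instr)
        ready_row[instr.dest] = row
    return rows


class ConfigCache:
    """Caches one configuration per basic-block start address (assumed key)."""

    def __init__(self):
        self._configs = {}
        self.translations = 0  # how many times dependency analysis actually ran

    def lookup(self, pc, block):
        if pc not in self._configs:
            self._configs[pc] = build_configuration(block)
            self.translations += 1
        return self._configs[pc]


# A four-instruction loop body: r1 and r2 are independent, r3 needs both,
# r4 needs r3.
loop_body = [
    Instr("r1", ("r0",)),
    Instr("r2", ("r0",)),
    Instr("r3", ("r1", "r2")),
    Instr("r4", ("r3",)),
]

cache = ConfigCache()
for _ in range(100):  # 100 loop iterations reuse the same configuration
    config = cache.lookup(0x400, loop_body)

row_sizes = [len(row) for row in config]
```

In this sketch the dependency analysis runs only on the first iteration; the remaining 99 iterations hit the cache, which is the software analogue of bypassing fetch, decode and dependency checks for a hot basic block.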
