Compiling for automatically generated instruction set extensions

The automatic generation of instruction set extensions (ISEs) to provide application-specific acceleration for embedded processors has been a productive area of research in recent years. The use of automatic algorithms, however, results in instructions that are radically different from those found in conventional ISAs. This has resulted in a gap between the hardware's capabilities and the compiler's ability to exploit them. This paper proposes an innovative high-level compiler pass that uses subgraph isomorphism checking to exploit these complex instructions. Our extended code generator also enables the reuse of ISEs designed for one application in another, which may be a newer version of the same application or a different one from the same domain. Operating in a separate pass permits the use of computationally expensive techniques that are uniquely suited to mapping complex instructions but unsuitable for conventional instruction selection. We demonstrate that this targeted use of an expensive algorithm keeps overall compilation time under control. The existing, mature compiler back-end then handles the remainder of the compilation. Instructions are automatically produced for 179 benchmarks, resulting in a total of 1965 unique instructions. The high-level pass, integrated into the open-source GCC compiler, is able to use the instructions produced for each benchmark to obtain an average speed-up of 1.26 for the EnCore extensible processor.
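As a rough illustration of the matching step, the sketch below looks for occurrences of a small instruction pattern inside a data-flow graph using a naive backtracking subgraph search. It is not the paper's implementation: the graph encoding, the function name `find_matches`, and the toy multiply-add pattern are all assumptions made for this example, and a production pass would more plausibly use a tuned algorithm and handle operand order, multiple outputs, and convexity constraints.

```python
from typing import Dict, Iterator, Set, Tuple

Edge = Tuple[str, str]  # (producer node, consumer node)


def find_matches(
    pattern_ops: Dict[str, str],
    pattern_edges: Set[Edge],
    dfg_ops: Dict[str, str],
    dfg_edges: Set[Edge],
) -> Iterator[Dict[str, str]]:
    """Yield each injective pattern-node -> DFG-node mapping that
    preserves opcodes and every pattern edge (hypothetical helper)."""
    order = list(pattern_ops)

    def consistent(mapping: Dict[str, str], p: str, d: str) -> bool:
        # Opcodes must agree and a DFG node may be used only once.
        if pattern_ops[p] != dfg_ops[d] or d in mapping.values():
            return False
        # Every pattern edge touching p and an already-mapped node
        # must also exist between the corresponding DFG nodes.
        for src, dst in pattern_edges:
            if src == p and dst in mapping and (d, mapping[dst]) not in dfg_edges:
                return False
            if dst == p and src in mapping and (mapping[src], d) not in dfg_edges:
                return False
        return True

    def extend(mapping: Dict[str, str]) -> Iterator[Dict[str, str]]:
        if len(mapping) == len(order):
            yield dict(mapping)  # copy, since mapping is mutated on backtrack
            return
        p = order[len(mapping)]
        for d in dfg_ops:
            if consistent(mapping, p, d):
                mapping[p] = d
                yield from extend(mapping)
                del mapping[p]

    yield from extend({})


# Toy pattern: a multiply whose result feeds an add (assumed ISE).
pattern_ops = {"m": "mul", "a": "add"}
pattern_edges = {("m", "a")}

# Toy data-flow graph: t1=mul, t2=add(t1,..), t3=sub(t2,..),
# t4=mul(t3,..), t5=add(t2,t3).
dfg_ops = {"t1": "mul", "t2": "add", "t3": "sub", "t4": "mul", "t5": "add"}
dfg_edges = {("t1", "t2"), ("t2", "t3"), ("t3", "t4"), ("t2", "t5"), ("t3", "t5")}

matches = list(find_matches(pattern_ops, pattern_edges, dfg_ops, dfg_edges))
print(matches)
```

The search is exponential in the worst case, which is the kind of cost the abstract argues is acceptable only when confined to a dedicated pass over a limited set of complex instructions.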
