Fast Identification of Custom Instructions for Extensible Processors

This paper proposes a fast algorithm that enumerates all convex subgraphs of a basic block's dataflow graph (DFG) that satisfy the I/O constraints. The algorithm can be tuned to enumerate either all such subgraphs or only the connected ones, which allows a trade-off between a better instruction-set extension (ISE) and faster design space exploration. The algorithm uses a grading method to select the next node to consider for inclusion in a subgraph. If the selected node is included, other related nodes are included with it, which ensures that the resulting subgraph is always convex and reduces the problem size by a block of nodes. If the selected node is not included, the DFG is split into smaller DFGs, which also reduces the problem size. On this basis, the algorithm employs a simple but efficient method to prune invalid subgraphs that violate the I/O constraints. Results show that for relatively small DFGs with a small exploration space, the runtime of the new algorithm is similar to that of existing algorithms. For larger DFGs with a much larger exploration space and with multiple input and output constraints, however, the runtime can be orders of magnitude lower than that of existing algorithms. The new algorithm can be used to quickly identify custom instructions for ISE of embedded processors.
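
To make the terms "convex" and "I/O constraints" concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm proposed in the paper (there is no grading, block inclusion, or DFG splitting here); it only checks every node subset of a tiny DFG. The function names, the example graph, and the I/O counting convention (inputs are distinct outside predecessors; outputs are subgraph nodes with an outside successor or with no successor at all) are assumptions made for illustration.

```python
from itertools import combinations

def invert(succs):
    """Build the predecessor map from a successor map."""
    preds = {n: [] for n in succs}
    for n, outs in succs.items():
        for m in outs:
            preds[m].append(n)
    return preds

def reachable(graph, start):
    """Nodes reachable from `start` by following one or more edges of `graph`."""
    seen, stack = set(), list(start)
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def is_convex(succs, preds, sub):
    """Convex: no path leaves the subgraph and re-enters it."""
    below = reachable(succs, sub) - sub  # outside descendants
    above = reachable(preds, sub) - sub  # outside ancestors
    return not (below & above)

def io_counts(succs, preds, sub):
    """Return (inputs, outputs) under the assumed counting convention."""
    inputs = {p for n in sub for p in preds[n] if p not in sub}
    # Assumption: a node with no successors is treated as live-out.
    outputs = {n for n in sub
               if not succs[n] or any(s not in sub for s in succs[n])}
    return len(inputs), len(outputs)

def enumerate_valid(succs, max_in, max_out):
    """Naive baseline: test all 2^n - 1 non-empty node subsets."""
    preds = invert(succs)
    nodes = sorted(succs)
    valid = []
    for k in range(1, len(nodes) + 1):
        for combo in combinations(nodes, k):
            sub = set(combo)
            n_in, n_out = io_counts(succs, preds, sub)
            if n_in <= max_in and n_out <= max_out and is_convex(succs, preds, sub):
                valid.append(frozenset(sub))
    return valid

# Hypothetical six-node DFG: a and b feed c; c fans out to d and e; both feed f.
EXAMPLE_DFG = {"a": ["c"], "b": ["c"], "c": ["d", "e"],
               "d": ["f"], "e": ["f"], "f": []}
```

In this example, `{c, f}` is not convex because the paths through `d` and `e` leave the subgraph and re-enter it, while `{c, d, e, f}` is convex with two inputs and one output. The exhaustive loop grows exponentially with the number of nodes, which is the cost the abstract's node-selection and pruning steps aim to reduce.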
