FISH: Fast Instruction SyntHesis for Custom Processors

This paper presents Fast Instruction SyntHesis (FISH), a system that automatically generates custom instruction processors from high-level application descriptions to enable fast design space exploration. FISH is based on novel methods for automatically adapting the instruction set to an application written in a high-level language such as C or C++. FISH identifies custom instruction candidates using two approaches: 1) enumeration of maximal convex subgraphs of application data flow graphs and 2) integer linear programming (ILP). Experiments on ten multimedia and cryptography benchmarks show that our algorithms are the fastest among state-of-the-art techniques. In most cases, enumeration takes only milliseconds, and the longest enumeration run time observed is less than six seconds. ILP is usually slower than enumeration but provides a complementary solution technique. Both enumeration and ILP support multiple merit functions for evaluating data flow subgraphs. The experiments demonstrate that, using only modest additional hardware resources, a performance improvement of up to 30-fold can be obtained over a single-issue base processor.
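To make the notion of a convex subgraph concrete, the following is a minimal Python sketch, not the FISH algorithm itself. It assumes a data flow graph stored as a successor dictionary and treats a node set as convex when no path between two of its nodes passes through a node outside the set. The function names, the toy graph `dfg`, and the brute-force enumeration are all illustrative assumptions; the brute-force loop is exponential and only suitable for tiny graphs.

```python
from itertools import combinations


def reachable(adj, sources):
    """Return all nodes reachable from `sources` via one or more edges of `adj`."""
    seen = set()
    stack = list(sources)
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def is_convex(succ, nodes):
    """Check convexity of `nodes` in the DAG given by successor lists `succ`.

    A node outside the set that is both a descendant and an ancestor of the
    set lies on a path that leaves the set and re-enters it.
    """
    nodes = set(nodes)
    pred = {}
    for u, vs in succ.items():
        for v in vs:
            pred.setdefault(v, []).append(u)
    descendants = reachable(succ, nodes)
    ancestors = reachable(pred, nodes)
    return not ((descendants & ancestors) - nodes)


def convex_subsets(succ):
    """Brute-force list of all non-empty convex node sets (illustration only)."""
    names = sorted(succ)
    result = []
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            if is_convex(succ, combo):
                result.append(set(combo))
    return result


# Hypothetical diamond-shaped data flow graph: a -> b -> d and a -> c -> d.
dfg = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}

print(is_convex(dfg, {"a", "b"}))       # convex: no path leaves and re-enters
print(is_convex(dfg, {"a", "b", "d"}))  # not convex: a -> c -> d passes through c
print(len(convex_subsets(dfg)))
```

In this toy graph the set {a, b, d} is rejected because implementing it as a single instruction would require the result of c, which itself depends on a, to be available in the middle of the instruction. A practical enumerator would additionally prune by input/output port limits and keep only maximal sets, which this sketch does not attempt.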
