Extended Instruction Exploration for Multiple-Issue Architectures

To satisfy the growing demand for high-performance computing in modern embedded devices, processor designs have adopted several architectural and microarchitectural enhancements. Extended instructions (EIs) are a common architectural enhancement, while issuing multiple instructions per cycle is a common microarchitectural enhancement. The impact of combining both approaches in the same design is not well understood. Previous studies have shown that EIs can improve performance for some applications on certain multiple-issue architectures, but the algorithms used to identify EIs for these architectures yield only limited performance improvement. This is because not all arithmetic operations are suited to inclusion in an EI on a multiple-issue architecture. To explore the full potential of EIs for multiple-issue architectures, two important factors need to be considered: (1) the execution performance of an application is dominated by operations that are critical (i.e., located on the critical path) or highly resource-contentious (i.e., having a high probability of being delayed during execution due to hardware resource limitations), and (2) an operation may become critical and/or highly resource-contentious after other operations are added to an EI. This article presents an EI exploration algorithm for multiple-issue architectures that focuses on these two factors. Simulation results show that the proposed algorithm outperforms previously published algorithms.
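
The second factor can be illustrated with a small sketch. The code below is not the article's algorithm; it is a minimal illustration, under stated assumptions, of how criticality shifts once operations are merged into an EI. It assumes a dataflow graph given as a latency table and a list of producer-to-consumer edges, treats an operation as critical when its ASAP and ALAP start times coincide (zero slack), and ignores resource contention entirely. The function name `critical_operations`, the operation names, and the one-cycle EI latency are all hypothetical.

```python
from collections import defaultdict


def critical_operations(latency, edges):
    """Return (zero-slack operations, schedule length) for a dataflow DAG.

    latency: dict mapping each operation name to its latency in cycles.
    edges:   list of (producer, consumer) dependence pairs.
    Assumes unlimited issue width, i.e. no resource contention is modeled.
    """
    succs, preds = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succs[u].append(v)
        preds[v].append(u)

    # Topological order (Kahn's algorithm).
    indeg = {op: len(preds[op]) for op in latency}
    ready = [op for op in latency if indeg[op] == 0]
    order = []
    while ready:
        op = ready.pop()
        order.append(op)
        for s in succs[op]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)

    # Earliest (ASAP) start time of each operation.
    asap = {}
    for op in order:
        asap[op] = max((asap[p] + latency[p] for p in preds[op]), default=0)
    length = max(asap[op] + latency[op] for op in order)

    # Latest (ALAP) start time that does not lengthen the schedule.
    alap = {}
    for op in reversed(order):
        alap[op] = min((alap[s] for s in succs[op]), default=length) - latency[op]

    return {op for op in order if asap[op] == alap[op]}, length


# Before customization: chain a -> b -> c and chain x -> y both feed "out".
before, len_before = critical_operations(
    {"a": 1, "b": 1, "c": 1, "x": 1, "y": 1, "out": 1},
    [("a", "b"), ("b", "c"), ("c", "out"), ("x", "y"), ("y", "out")],
)

# After merging a, b, c into one hypothetical EI with an assumed 1-cycle latency.
after, len_after = critical_operations(
    {"ei": 1, "x": 1, "y": 1, "out": 1},
    [("ei", "out"), ("x", "y"), ("y", "out")],
)
```

In this toy graph, `x` and `y` have slack before the merge but lie on the critical path afterwards, so an exploration pass that only examined the original critical path would miss them. A fuller model would also have to account for issue width and functional-unit limits, which this sketch leaves out.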
