Area-Efficient Instruction Set Extension Exploration with Hardware Design Space Exploration

Instruction set extension (ISE) is an effective approach to improve the processor performance without tremendous modification in its core architecture. To execute ISE(s), a processor core must be augmented with a new functional unit, called application specific functional unit (ASFU), which consists of multiple hardware implementation options of ISEs (ISE_HW). Obviously, since ISE_HW increases the production cost of a processor core, minimizing the area size of ISE_HW becomes important for ISE exploration. On the other hand, because of different requirements in space and speed, ISE_HW usually has multiple hardware implementation options. Under pipeline-stage timing constraint, some of these options may have the same performance improvement but entail different hardware costs. According to this phenomenon, the area size of ISE_HW can be reduced by performing hardware design space exploration of ISE_HW. Therefore, in this paper, we propose an ISE exploration algorithm that explores not only ISE but also the hardware design space of ISE_HW. Compared with the previous research, our approach resulted in significant improvement in area efficiency and the execution performance.

[1]  Koen Bertels,et al.  The Instruction-Set Extension Problem: A Survey , 2008, TRETS.

[2]  Daoxiong Gong,et al.  A hybrid approach of GA and ACO for TSP , 2004, Fifth World Congress on Intelligent Control and Automation (IEEE Cat. No.04EX788).

[3]  Scott A. Mahlke,et al.  Automated custom instruction generation for domain-specific processor acceleration , 2005, IEEE Transactions on Computers.

[4]  T. C. May,et al.  Instruction-set matching and selection for DSP and ASIP code generation , 1994, Proceedings of European Design and Test Conference EDAC-ETC-EUROASIC.

[5]  Paolo Ienne,et al.  Exact and approximate algorithms for the extension of embedded processor instruction sets , 2006, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[6]  Tulika Mitra,et al.  Disjoint Pattern Enumeration for Custom Instructions Identification , 2007, 2007 International Conference on Field Programmable Logic and Applications.

[7]  Tony White,et al.  Using Genetic Algorithms to Optimize ACS-TSP , 2002, Ant Algorithms.

[8]  Srivaths Ravi,et al.  Custom-instruction synthesis for extensible-processor platforms , 2004, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[9]  Stamatis Vassiliadis,et al.  A Linear Complexity Algorithm for the Automatic Generation of Convex Multiple Input Multiple Output Instructions , 2007, ARC.

[10]  Gang Wang,et al.  Application partitioning on programmable platforms using the ant colony optimization , 2006, J. Embed. Comput..

[11]  Marco Dorigo,et al.  Ant system: optimization by a colony of cooperating agents , 1996, IEEE Trans. Syst. Man Cybern. Part B.

[12]  Trevor Mudge,et al.  MiBench: A free, commercially representative embedded benchmark suite , 2001 .

[13]  Miodrag Potkonjak,et al.  MediaBench: a tool for evaluating and synthesizing multimedia and communications systems , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[14]  Fadi J. Kurdahi,et al.  Partitioning by regularity extraction , 1992, [1992] Proceedings 29th ACM/IEEE Design Automation Conference.