Parallel custom instruction identification for extensible processors

Extensible processors can be customized for an application domain, and they have been increasingly used in embedded systems in recent years. They achieve this customization by executing parts of the application code in hardware instead of software. Identifying the parts of the application code to implement as custom instructions generally requires subgraph enumeration and subgraph selection, both of which are computationally difficult problems. Most previous work focuses on sequential algorithms for these two problems. In this paper, we present a parallel implementation of a recent subgraph enumeration algorithm on a computer cluster. For the subgraph selection problem, we propose a standard ant colony optimization (ACO) algorithm, a modified version of ACO with local optimum search, and a parallel ACO algorithm. Experimental results show that the parallel algorithms outperform the sequential algorithms in terms of runtime, quality of results, or both. In addition, we formally prove an upper bound on the number of feasible solutions to the subgraph selection problem, both with and without the overlapping constraint.
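To make the subgraph selection step more concrete, the following is a minimal, generic ant-colony sketch and not the algorithm evaluated in the paper. It assumes that each candidate subgraph is given as a set of node ids with a positive estimated gain, and that the overlapping constraint means selected subgraphs may not share nodes. The function name `aco_select`, its parameters, and the pheromone update rule are hypothetical choices made for illustration.

```python
import random


def aco_select(candidates, gains, n_ants=10, n_iters=30, rho=0.1, seed=0):
    """Pick non-overlapping candidates with high total gain (illustrative only).

    candidates: list of non-empty frozensets of node ids (assumed input format)
    gains: list of positive floats, one per candidate (assumed input format)
    """
    rng = random.Random(seed)
    tau = [1.0] * len(candidates)  # one pheromone value per candidate
    best_sel, best_gain = [], 0.0
    for _ in range(n_iters):
        for _ in range(n_ants):
            used, sel = set(), []
            avail = list(range(len(candidates)))
            while True:
                # Drop candidates that share a node with the current selection.
                avail = [i for i in avail if not (candidates[i] & used)]
                if not avail:
                    break
                # Choose the next candidate with probability ~ pheromone * gain.
                weights = [tau[i] * gains[i] for i in avail]
                i = rng.choices(avail, weights=weights, k=1)[0]
                sel.append(i)
                used |= candidates[i]
            total = sum(gains[i] for i in sel)
            if total > best_gain:
                best_sel, best_gain = sorted(sel), total
        # Evaporate all pheromone, then reinforce the best selection so far
        # (an assumed update rule, chosen for simplicity).
        tau = [(1.0 - rho) * t for t in tau]
        for i in best_sel:
            tau[i] += rho * best_gain
    return best_sel, best_gain


# Toy instance: candidate 1 has the largest gain but overlaps candidates 0 and 2.
candidates = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4}), frozenset({5})]
gains = [3.0, 5.0, 3.0, 1.0]
selection, total_gain = aco_select(candidates, gains)
print(selection, total_gain)
```

In a sketch like this, the ants within one iteration are independent of each other, which is one reason ACO is a natural candidate for parallelization; how the paper distributes the work across a cluster is not specified in the abstract.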
