Considering the effect of process variations during the ISA extension design flow

In this paper, we present a technique for custom instruction (CI) extension that takes process variations into account. The technique bridges the gap between high-level custom instruction extension and chip fabrication in nanoscale technologies. In particular, instead of the conventional Static Timing Analysis (STA), it uses Statistical Static Timing Analysis (SSTA). The approach is therefore probabilistic: the delay of each CI is modeled by a Probability Density Function (PDF). Using this probabilistic approach, candidate CIs that meet predefined constraints are first identified (identification phase), and a subset of them is then selected for realization to improve a given merit function (selection phase). In the identification phase, performance yield under both random and systematic variations is added as a constraint. A pruning technique is also proposed to decrease the runtime of the systematic variation modeling. The results show that this technique reduces the number of CIs that need systematic variation modeling by about 24.6% for the cases studied in this work. In the selection phase, both greedy and branch-and-bound approaches are used. In the greedy approach, the conventional merit function based on cycle saving and area is modified to include the performance yield. The results show that the proposed merit function leads to an increase of about 3.2% in speedup. In the branch-and-bound method, an effective pruning technique is described to reduce the runtime. This pruning technique reduces the search space by about 62%.
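
As a rough illustration of the ideas above, the following Python sketch treats the delay of each CI as a Gaussian PDF, computes the performance yield as the probability that the delay fits within the clock period, filters candidates by a yield constraint, and then performs a greedy selection with a yield-aware merit. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian delay model, the merit formula `cycle_saving * yield / area`, the area budget, and the names `CandidateCI`, `performance_yield`, `identify`, and `greedy_select`.

```python
import math
from dataclasses import dataclass


@dataclass
class CandidateCI:
    """Hypothetical candidate custom instruction (all fields are illustrative)."""
    name: str
    mean_delay: float    # mean of the delay PDF, in ns
    sigma_delay: float   # standard deviation of the delay PDF, in ns (must be > 0)
    cycle_saving: int    # cycles saved if the CI is realized
    area: float          # area cost in arbitrary units


def performance_yield(ci: CandidateCI, clock_period: float) -> float:
    """P(delay <= clock_period), assuming a Gaussian delay PDF."""
    z = (clock_period - ci.mean_delay) / (ci.sigma_delay * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def identify(candidates, clock_period, min_yield):
    """Identification phase: keep candidates that meet the yield constraint."""
    return [ci for ci in candidates
            if performance_yield(ci, clock_period) >= min_yield]


def greedy_select(candidates, clock_period, area_budget):
    """Selection phase: greedy pick by an assumed yield-aware merit."""
    def merit(ci):
        # Assumed merit: cycle saving weighted by yield, per unit area.
        return ci.cycle_saving * performance_yield(ci, clock_period) / ci.area

    chosen, used_area = [], 0.0
    for ci in sorted(candidates, key=merit, reverse=True):
        if used_area + ci.area <= area_budget:
            chosen.append(ci)
            used_area += ci.area
    return chosen


if __name__ == "__main__":
    cis = [
        CandidateCI("A", mean_delay=0.8, sigma_delay=0.1, cycle_saving=10, area=2.0),
        CandidateCI("B", mean_delay=1.0, sigma_delay=0.1, cycle_saving=12, area=2.0),
        CandidateCI("C", mean_delay=0.5, sigma_delay=0.1, cycle_saving=4, area=1.0),
    ]
    for ci in cis:
        print(ci.name, round(performance_yield(ci, clock_period=1.0), 3))
    print([ci.name for ci in greedy_select(cis, clock_period=1.0, area_budget=3.0)])
```

In this toy setup, candidate B saves the most cycles but its mean delay sits exactly at the clock period, so its yield is only one half and the yield-aware merit ranks it below A and C. A merit based on cycle saving and area alone would have ranked B first.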
