A Hierarchical Transistor and Gate-Level Statistical Timing Flow for Microprocessor Designs

This paper presents a hierarchical statistical timing flow for microprocessor designs with minimal run-time and memory overhead compared to a deterministic timing flow. Application of the flow to robust custom circuit design under variability is highlighted.

I. CONTEXT

High-performance microprocessors contain custom-designed circuit macros to achieve aggressive frequency targets. These custom-designed circuits are typically timed using circuit simulation engines. Microprocessor designs can contain upwards of one billion transistors. Circuit simulation, while highly accurate, is run-time intensive and is not practical in a timing flow where chip-level timing runs are made daily during the design cycle of the chip. This has led to the development of hierarchical contract-based timing, in which custom parts of the design are timed using transistor-level timing tools [1] with circuit-simulation-type accuracy. Timing abstract models are then generated that reflect, in a simpler and more compact form, the timing characteristics of the custom logic. The timing characteristics are captured by slew- and load-dependent tables. Timing abstraction reduces the size of the timing graph through pruning and arc compression. These techniques can significantly reduce the number of timing arcs to be analyzed at the next level of hierarchy (unit or chip level). Model reductions of 400% are common. At the chip level, custom logic macros are represented by these abstracts and can be timed quickly without the use of circuit simulation.

II. MOTIVATION

Variability plays an increasingly important role in chip timing today, especially for high-speed circuits like those used in microprocessor designs. Timing analysis that considers variability (e.g., statistical timing) is now required as part of sign-off timing methodologies.
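To make the slew- and load-dependent tables of the Context section concrete, the following is a minimal sketch of a table-lookup delay calculation using bilinear interpolation. It is an illustration, not the paper's implementation: the function name `lookup_delay`, the grid values, and the units are all hypothetical, and edge-cell extrapolation is an assumed policy for points outside the characterized range.

```python
from bisect import bisect_right


def lookup_delay(slews, loads, table, slew, load):
    """Bilinearly interpolate table[i][j], the delay at (slews[i], loads[j]).

    Both axes must be strictly increasing and have at least two points.
    """
    def bracket(axis, x):
        # Index of the lower grid point of the cell containing x, clamped so
        # that points outside the grid are extrapolated from the edge cell.
        i = bisect_right(axis, x) - 1
        return min(max(i, 0), len(axis) - 2)

    i = bracket(slews, slew)
    j = bracket(loads, load)
    # Fractional position of the query point inside the selected cell.
    ts = (slew - slews[i]) / (slews[i + 1] - slews[i])
    tl = (load - loads[j]) / (loads[j + 1] - loads[j])
    # Interpolate along the load axis first, then along the slew axis.
    low = table[i][j] * (1 - tl) + table[i][j + 1] * tl
    high = table[i + 1][j] * (1 - tl) + table[i + 1][j + 1] * tl
    return low * (1 - ts) + high * ts


# Hypothetical 3x3 table: input slew in ps, output load in fF, delay in ps.
SLEWS = [10.0, 50.0, 100.0]
LOADS = [1.0, 5.0, 10.0]
DELAYS = [
    [12.0, 20.0, 30.0],
    [15.0, 24.0, 35.0],
    [20.0, 30.0, 42.0],
]

mid_cell_delay = lookup_delay(SLEWS, LOADS, DELAYS, 30.0, 3.0)
```

A lookup of this kind costs a few arithmetic operations per timing arc, which is why an abstracted macro can be timed at the chip level without invoking circuit simulation.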
For designs built from standard cell libraries, statistical effects can be modeled within the characterized delay models. Statistical timing is traditionally based on finite differencing of timing quantities (such as delays, slews, or timing waveforms) computed at multiple process corners. This approach is usable in timing flows where delay calculations can be performed extremely rapidly, for example with a table-lookup delay calculator based on .lib models, but it is not feasible in custom transistor-level timing because of the extreme simulation run-times. Additionally, most transistor-level timing tools are not variation-aware and can only perform deterministic timing today. Guard bands in the form of large timing margins are instead applied …
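The finite-differencing approach mentioned above can be sketched as follows. This is a hedged illustration under stated assumptions, not the flow proposed in the paper: `delay_at` stands in for any delay calculator, the parameter names `vt` and `leff` are hypothetical, process parameters are expressed in units of sigma, and the spread is computed assuming independent, unit-normal parameters with a first-order (linear) delay model.

```python
import math


def finite_difference_sensitivities(delay_at, params, step=3.0):
    """Return (nominal_delay, {param: delay change per sigma}).

    delay_at(corner) takes a dict mapping each parameter name to its
    deviation from nominal, in sigma units, and returns a delay.
    """
    nominal_corner = {p: 0.0 for p in params}
    nominal = delay_at(nominal_corner)
    sens = {}
    for p in params:
        corner = dict(nominal_corner)
        corner[p] = step  # move one parameter to its +step-sigma corner
        # One-sided finite difference between the corner and nominal delay.
        sens[p] = (delay_at(corner) - nominal) / step
    return nominal, sens


def delay_sigma(sens):
    """Delay standard deviation, assuming independent unit-normal parameters."""
    return math.sqrt(sum(a * a for a in sens.values()))


# Hypothetical stand-in for a real delay calculation (values in ps).
def toy_delay(corner):
    return 100.0 + 4.0 * corner["vt"] + 3.0 * corner["leff"]


nominal, sens = finite_difference_sensitivities(toy_delay, ["vt", "leff"])
sigma = delay_sigma(sens)
```

With one-sided differences, N parameters require N + 1 delay evaluations per timing quantity. That cost is negligible when each evaluation is a table lookup, but it multiplies the run-time of every circuit simulation in a transistor-level flow.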

[1] Bill Dewey, et al. Transistor-Level Tools for High-End Processor Custom Circuit Design at IBM, 2007, Proceedings of the IEEE.

[2] Natesan Venkateswaran, et al. First-Order Incremental Block-Based Statistical Timing Analysis, 2006, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst.