Interlocked synchronous pipelines
Modern digital VLSI design faces significant challenges as physical limitations place an increasing number of constraints on the design process. Dynamic power dissipation, global signal distribution, and simultaneous switching noise are arguably the three constraints most affected by the continuing increase in on-chip functions, pipeline depth, and clock frequency.
As the number of latches grows as a result of more on-chip functions and deeper pipelining, power dissipation is breaking the thermal envelope for cost-effective power distribution, packaging, and cooling solutions. Unconstrained clock power dissipation accounts for the majority of the total power dissipation in modern microprocessor designs. Pervasive clock gating at a fine granularity is key to constraining power dissipation. At the same time, shorter cycle times and increasing delay on global wires are limiting the amount of logic that can be reached within one clock cycle. Stalling synchronous pipelines in particular is becoming a significant challenge, as the stall signals have to be distributed at a global level. To avoid affecting cycle time, it is important to find cost-effective solutions that stall synchronous pipelines progressively, at the local stage level. Clock gating and pipeline stalling in turn raise concerns because they affect simultaneous switching noise. To reduce their effect on high-frequency variance in current demand, the ability to perform both clock gating and stalling at a fine granularity, such as the pipeline stage level, is becoming increasingly important.
We present a novel technique, Interlocked Synchronous Pipelines (ISP), that simultaneously helps address the above-mentioned problems by providing stage-level interlocking in synchronous pipelines without incurring area or throughput penalties. The ISP technique provides optimal clock gating at the stage level and offers progressive stalling of pipelines, one stage per clock cycle. Clock gating and stalling at the fine-grained stage level help reduce clock power and cycle-to-cycle variance in current demand, and also improve delay on clock gating and stall signals. In addition, ISP offers dual data storage in master/slave registers that can be used to improve the storage properties of queue structures without increasing area or power. ISP has been applied in the design of a deeply pipelined, high-frequency multiply/add-accumulate unit and has shown significant reductions in dynamic power dissipation, cycle-to-cycle variance in current demand (di/dt), circuit area, and stall signal delay.
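The progressive stalling described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the circuit presented in this work: each stage is modelled as a two-entry buffer standing in for the master/slave dual storage, and each stage shows its upstream neighbour a stall flag that is updated once per clock cycle. The names `simulate`, `stall`, and `sink_stall_cycles` are hypothetical and chosen for illustration.

```python
from collections import deque


def simulate(num_stages, num_cycles, sink_stall_cycles):
    """Cycle-level model of a pipeline with stage-local stalling.

    Assumptions (not taken from the source): every stage can hold up to
    two items, and a stage raises its stall flag toward the upstream
    stage only after it has filled both slots.
    """
    stages = [deque() for _ in range(num_stages)]
    stall = [False] * num_stages   # registered flag each stage shows upstream
    delivered = []                 # items that left the last stage, in order
    stall_trace = []               # stall flags at the end of every cycle
    next_item = 0

    for cycle in range(num_cycles):
        sink_stall = cycle in sink_stall_cycles

        # Decide all transfers from the start-of-cycle state only, so that
        # stall information travels at most one stage per clock cycle.
        sends = []
        for i in range(num_stages):
            last = i == num_stages - 1
            downstream_stalled = sink_stall if last else stall[i + 1]
            sends.append(bool(stages[i]) and not downstream_stalled)
        source_sends = not stall[0]

        # Apply the transfers simultaneously.
        moved = [stages[i].popleft() if sends[i] else None
                 for i in range(num_stages)]
        for i in range(num_stages):
            if sends[i]:
                if i == num_stages - 1:
                    delivered.append(moved[i])
                else:
                    stages[i + 1].append(moved[i])
        if source_sends:
            stages[0].append(next_item)
            next_item += 1

        # Two slots are always enough in this model.
        assert all(len(s) <= 2 for s in stages)

        # A stage stalls its upstream neighbour once both slots are in use.
        stall = [len(s) >= 2 for s in stages]
        stall_trace.append(tuple(stall))

    return delivered, stall_trace


# Four stages, 20 cycles, consumer stalled during cycles 6 to 9.
delivered, stall_trace = simulate(4, 20, set(range(6, 10)))
for cycle in range(6, 14):
    print(cycle, stall_trace[cycle])
```

In this model the stall flags rise from the last stage to the first, one stage per cycle, and fall in the same order once the consumer resumes. No item is lost or reordered, because the item already in flight when a stage stalls lands in that stage's second slot.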