Energy-efficient High-level Synthesis for HDR Architectures with Clock Gating Based on Concurrency-oriented Scheduling

With the miniaturization of LSIs and their increasing performance, demand for highly functional portable devices has grown significantly. At the same time, battery lifetime and device overheating have become major design problems that hamper further LSI integration. In addition, the ratio of interconnection delay to gate delay continues to increase as device feature size decreases. We therefore have to estimate interconnection delays and reduce energy consumption even at the high-level synthesis stage. In this paper, we propose a high-level synthesis algorithm for huddle-based distributed-register architectures (HDR architectures) with clock gating, based on concurrency-oriented scheduling and functional unit binding. We assume coarse-grained clock gating applied per huddle, and we focus on the number of control steps at which clock gating can be applied to the registers in each huddle, which we call gating steps. We propose two methods to increase gating steps. First, we schedule and bind operations so that they are performed at the same timing. By adjusting clock gating timings at the high-level synthesis stage, we expect to enhance the effect of clock gating more than by applying it after logic synthesis. Second, we synthesize huddles so that each huddle contains registers with the same or similar clock gating timings. In doing so, we determine the clock gating timings so as to minimize total energy consumption, including clock tree energy. The experimental results show that our proposed algorithm reduces energy consumption by a maximum of 23.8% compared with several conventional algorithms.
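To make the notion of gating steps concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes, for illustration only, that a register needs its clock only at control steps where it is written, and that a huddle can be gated at a step only if none of its registers is written at that step. The register names, the schedule, and the helper functions `gating_steps` and `total_gating_steps` are all hypothetical.

```python
from typing import Dict, List, Set


def gating_steps(write_steps: Dict[str, Set[int]],
                 huddle: List[str],
                 num_steps: int) -> Set[int]:
    """Control steps at which no register in the huddle is written.

    Assumption: the huddle's clock can be gated at exactly these steps.
    """
    active: Set[int] = set()
    for reg in huddle:
        active |= write_steps[reg]
    return set(range(num_steps)) - active


def total_gating_steps(write_steps: Dict[str, Set[int]],
                       huddles: List[List[str]],
                       num_steps: int) -> int:
    """Sum of gating steps over all huddles."""
    return sum(len(gating_steps(write_steps, h, num_steps)) for h in huddles)


# Hypothetical schedule over 4 control steps: r1 and r2 are written at the
# same steps, as are r3 and r4.
write_steps = {"r1": {0, 2}, "r2": {0, 2}, "r3": {1, 3}, "r4": {1, 3}}
num_steps = 4

# Grouping registers with the same write timings into one huddle.
similar = [["r1", "r2"], ["r3", "r4"]]
# Grouping registers with different write timings into one huddle.
mixed = [["r1", "r3"], ["r2", "r4"]]

similar_total = total_gating_steps(write_steps, similar, num_steps)
mixed_total = total_gating_steps(write_steps, mixed, num_steps)
```

Under these assumptions, the grouping by similar timings yields more gating steps than the mixed grouping, which is the intuition behind both aligning operation timings and forming huddles from registers with similar clock gating timings.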