HLS-l: A High-Level Synthesis Framework for Latch-Based Architectures

Level-sensitive latches are widely used in high-performance custom designs, whereas edge-triggered flip-flops predominate in application-specific integrated circuits. We take the latch as the basic storage element and address each step of high-level synthesis (HLS): scheduling, allocation, and control synthesis. Latches offer an opportunity to reduce latency during scheduling, but register allocation must account for the extra conflicts that latches introduce, and control synthesis must be tailored to support the latch-based data path. We identify optimization opportunities specific to this form of HLS and propose solutions. Specifically, register allocation is improved by refining the operation schedule so as to reduce the number of edges in the register conflict graph, and latency is reduced by adjusting the clock duty cycle so as to obtain a tighter schedule. All HLS steps and optimization procedures are integrated into a framework called HLS-l, which was tested on benchmark designs implemented in a 1.1 V, 45 nm complementary metal-oxide-semiconductor (CMOS) technology. Compared to conventional HLS, HLS-l reduced latency by 18.2% on average, with 9.2% less area and 16.0% less power consumption. The application of HLS-l to an industrial example is demonstrated through the design of a module extracted from H.264/advanced video coding.
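The effect of extra latch conflicts on register allocation can be illustrated with a small sketch. It is not the HLS-l algorithm. It assumes a simplified model in which each variable has a half-open lifetime `(birth, death)` measured in control steps, and in which a latch, being transparent for part of the cycle, makes a variable that dies at step t conflict with one born at step t. The function names, the example lifetimes, and the greedy coloring heuristic are all hypothetical choices made for illustration.

```python
from itertools import combinations

# Hypothetical variable lifetimes as (birth, death) control steps.
LIFETIMES = {"a": (0, 2), "b": (2, 4), "c": (1, 3), "d": (4, 5)}


def conflict_edges(lifetimes, latch_based):
    """Return the edge set of the register conflict graph.

    Flip-flop model: two variables conflict when their half-open
    lifetimes overlap. Latch model (an assumption of this sketch):
    lifetimes that merely touch at a step boundary also conflict.
    """
    edges = set()
    for (u, (bu, du)), (v, (bv, dv)) in combinations(sorted(lifetimes.items()), 2):
        if latch_based:
            overlap = bu <= dv and bv <= du
        else:
            overlap = bu < dv and bv < du
        if overlap:
            edges.add((u, v))
    return edges


def greedy_allocate(lifetimes, edges):
    """Greedy graph coloring in order of birth time; colors are register indices."""
    neighbors = {name: set() for name in lifetimes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    register_of = {}
    for name in sorted(lifetimes, key=lambda n: lifetimes[n]):
        used = {register_of[n] for n in neighbors[name] if n in register_of}
        reg = 0
        while reg in used:
            reg += 1
        register_of[name] = reg
    return register_of


def register_count(lifetimes, latch_based):
    """Number of registers the greedy allocation uses under the chosen model."""
    allocation = greedy_allocate(lifetimes, conflict_edges(lifetimes, latch_based))
    return len(set(allocation.values()))
```

Under this model the same schedule yields more conflict edges, and here more registers, with latches than with flip-flops. That is why a schedule refinement that removes conflict edges can pay off in the allocation step.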
