High-level synthesis with behavioral level multi-cycle path analysis

High-level synthesis (HLS) tools generate register-transfer level (RTL) hardware descriptions through resource allocation, scheduling, and binding. The quality of the RTL directly influences the quality of logic synthesis: the RTL description determines the achievable clock rate, the area, and the latency in clock cycles. However, not all paths should receive equal logic synthesis effort. Multi-cycle paths have more than one clock cycle in which to settle, so synthesis effort can be redirected from them to more critical paths to achieve better design quality. In this paper, we perform multi-cycle optimization on chained functional operations. We couple HLS and logic synthesis so that multi-cycle paths are identified and optimized coherently across both the behavioral and logic levels. In addition, we perform multi-cycle path analysis efficiently at the behavioral level. We prove that our technique examines all reachable circuit states and finds multi-cycle paths that involve control flow and guarding conditions, which improves the flexibility and power of the technique. Compared to LegUp, we achieve on average a 55% improvement in execution time, a 29% improvement in area, and a 68% improvement in time-area product when targeting an FPGA architecture.
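To make the notion of a behavioral-level multi-cycle path concrete, the following is a minimal sketch, not the analysis proposed in the paper. It assumes a hypothetical model in which the schedule is a finite-state machine given as a successor map, a source register is loaded at the clock edge that ends `launch_state`, and a destination register captures at the clock edge that ends `capture_state`. The smallest number of clock edges between the two states over every control-flow route is then the cycle budget of the combinational path between the registers. The function name, the state names, and the clock period are illustrative assumptions, and the sketch deliberately ignores guarding conditions and any other writes to the source register.

```python
from collections import deque

def min_cycles_between(fsm, launch_state, capture_state):
    """Return the fewest clock edges from launch_state to capture_state.

    fsm maps each state name to a list of successor states (hypothetical
    schedule model). Returns None if capture_state is never reached.
    """
    # Start from the successors of the launch state: the launched value
    # becomes visible one clock edge after launch_state.
    frontier = deque((nxt, 1) for nxt in fsm.get(launch_state, ()))
    seen = set()
    while frontier:
        state, dist = frontier.popleft()
        if state == capture_state:
            return dist  # breadth-first order makes this the minimum
        if state in seen:
            continue
        seen.add(state)
        for nxt in fsm.get(state, ()):
            frontier.append((nxt, dist + 1))
    return None

# Hypothetical four-state schedule with a branch in S1.
example_fsm = {
    "S0": ["S1"],
    "S1": ["S2", "S3"],
    "S2": ["S3"],
    "S3": ["S0"],
}

CLOCK_PERIOD_NS = 5.0  # assumed clock period, for illustration only

cycles = min_cycles_between(example_fsm, "S0", "S3")
# The path may be given this relaxed delay budget instead of one period.
budget_ns = cycles * CLOCK_PERIOD_NS
```

In this example the shortest route from `S0` to `S3` goes through `S1` only, so the path has a two-cycle budget even though the longer route through `S2` would allow three. Taking the minimum over all routes is what keeps the relaxed constraint safe under this simplified model.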
