Balanced Scheduling and Operation Chaining in High-Level Synthesis for FPGA Designs

In high-level synthesis for FPGA designs, scheduling and chaining operations for optimal performance remain challenging problems. In this paper, we present a balanced scheduling routine that distributes operations uniformly across states to shorten critical timing paths when accurate functional unit delay models are unavailable. On average, balanced scheduling improves both frequency and run times relative to ASAP, ALAP, and force-directed scheduling. We also provide a methodology for precision-based delay modeling of operations, and we present a balanced chaining routine that, given a target frequency, uses this modeling technique to reduce the number of clock cycles in the design. Incorporating balanced chaining with scheduling improves run times by approximately 20% on average. Applied in a high-level synthesis tool, balanced chaining yielded performance improvements of 8-29x for large, complex applications. Our method for modeling operation delays is shown to be accurate in estimating delays for operation chaining during high-level synthesis.
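The two ideas in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a simple greedy heuristic, and the function names (`balanced_schedule`, `chain`), the unit-latency operations, and the fixed per-operation delays in the example are all hypothetical. The first function places each operation in the least-loaded state inside its ASAP/ALAP window; the second packs dependent operations into one clock cycle while their accumulated delay fits within a target period.

```python
from collections import defaultdict


def balanced_schedule(ops, preds):
    """Assign each op to the least-loaded state within its ASAP/ALAP window.

    ops must be in topological order; preds maps each op to its predecessors.
    Every op is assumed to take exactly one state (an illustrative assumption).
    """
    succs = defaultdict(list)
    for op in ops:
        for p in preds[op]:
            succs[p].append(op)

    # ASAP: earliest state allowed by data dependencies.
    early = {}
    for op in ops:
        early[op] = max((early[p] + 1 for p in preds[op]), default=0)
    last_state = max(early.values())

    # ALAP: latest state that does not lengthen the schedule.
    late = {}
    for op in reversed(ops):
        late[op] = min((late[s] - 1 for s in succs[op]), default=last_state)

    load = defaultdict(int)  # number of ops already placed in each state
    sched = {}
    for op in ops:
        lo = max([early[op]] + [sched[p] + 1 for p in preds[op]])
        hi = late[op]
        # Pick the emptiest state in the window; ties go to the earliest state.
        state = min(range(lo, hi + 1), key=lambda s: (load[s], s))
        sched[op] = state
        load[state] += 1
    return sched


def chain(ops, preds, delay, period):
    """Greedily chain dependent ops into the same cycle under a clock period.

    delay maps each op to an estimated delay; each delay is assumed to be
    no larger than period. Returns the clock cycle assigned to each op.
    """
    cycle, finish = {}, {}
    for op in ops:
        c, t = 0, 0
        for p in preds[op]:
            # Start after the latest-finishing predecessor.
            if (cycle[p], finish[p]) > (c, t):
                c, t = cycle[p], finish[p]
        if t + delay[op] > period:
            c, t = c + 1, 0  # does not fit: start a new cycle
        cycle[op] = c
        finish[op] = t + delay[op]
    return cycle


# Toy graph: a -> b -> c is the critical path; d, e, f are independent.
ops = ["a", "b", "c", "d", "e", "f"]
preds = {"a": [], "b": ["a"], "c": ["b"], "d": [], "e": [], "f": []}
sched = balanced_schedule(ops, preds)

# Hypothetical delays in nanoseconds and a 10 ns target clock period.
cycles = chain(["a", "b", "c"], preds, {"a": 3, "b": 4, "c": 5}, period=10)
```

On this toy graph, ASAP scheduling would place four operations in the first state, whereas the sketch places two operations in each of the three states. The chaining call fits `a` and `b` into one cycle and pushes `c` to the next, so the chain takes two cycles instead of three.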
