Behavioral optimization using the manipulation of timing constraints

We introduce a transformation, named rephasing, that manipulates the timing parameters in control-data-flow graphs (CDFGs) during the high-level synthesis of data-path-intensive applications. Timing parameters in such CDFGs include the sample period, the latencies between input-output pairs, the relative times at which corresponding samples become available on different inputs, and the relative times at which corresponding samples become available at the delay nodes. While some of these timing parameters may be constrained by performance requirements or by the interface to the external world, others remain free to be chosen during high-level synthesis. The relative times at which corresponding samples are available at input and delay nodes are called phases. Traditionally, high-level synthesis systems for data-path-intensive applications have handled phases in one of two ways. They have either assumed that all phases are zero (i.e., all input and delay node samples enter at the initial cycle of the schedule) or, in the case of newer schedulers that use techniques like overlapped scheduling to generate complex time shapes, assigned values to the phases automatically as part of the data-path allocation/scheduling step. Rephasing, in contrast, manipulates the values of these phases as an algorithm transformation before the scheduling/allocation stage. The advantage of this approach is that phase values can be chosen to transform and optimize the algorithm for explicit metrics such as area, throughput, latency, and power. Moreover, the rephasing transformation can be combined with other transformations, such as algebraic transformations. We have developed techniques for using rephasing to optimize a variety of design metrics, and our results show significant improvements in several of them. We have also investigated the relationship and interaction of rephasing with other high-level synthesis tasks.
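To make the role of phases concrete, the following is a minimal sketch, not the method developed in the paper. It assumes a hypothetical three-adder graph (`OPS`), unit-latency operations, and a plain as-soon-as-possible (ASAP) scheduler standing in for a real scheduling/allocation step. It compares an all-zero phase assignment with one in which two inputs arrive one cycle later, and reports the peak number of adders busy in any cycle together with the cycle at which the output becomes available.

```python
from collections import Counter

# Hypothetical toy data-flow graph: out = (a + b) + (c + d).
# Keys are operation names in topological order; values are their operands.
OPS = {"s1": ("a", "b"), "s2": ("c", "d"), "out": ("s1", "s2")}


def asap_schedule(ops, phases, op_latency=1):
    """ASAP-schedule ops given the phase (arrival cycle) of each input.

    Returns (start, ready): the start cycle of each operation and the cycle
    at which every value (input or result) becomes available.
    """
    ready = dict(phases)
    start = {}
    for name, operands in ops.items():
        # An operation can start once all of its operands are available.
        start[name] = max(ready[o] for o in operands)
        ready[name] = start[name] + op_latency
    return start, ready


def peak_units(start, op_latency=1):
    """Largest number of operations executing in the same cycle."""
    busy = Counter()
    for s in start.values():
        for cycle in range(s, s + op_latency):
            busy[cycle] += 1
    return max(busy.values())


def evaluate(phases):
    """Return (peak adders in use, cycle at which 'out' is available)."""
    start, ready = asap_schedule(OPS, phases)
    return peak_units(start), ready["out"]


zero_phase = {"a": 0, "b": 0, "c": 0, "d": 0}  # all samples enter at cycle 0
shifted = {"a": 0, "b": 0, "c": 1, "d": 1}     # c and d arrive one cycle later

for label, phases in (("zero phases", zero_phase), ("shifted phases", shifted)):
    adders, out_ready = evaluate(phases)
    print(f"{label}: peak adders = {adders}, output ready at cycle {out_ready}")
```

In this toy case, delaying two inputs by one cycle lets a single adder serve all three additions, at the cost of one extra cycle before the output is ready. This is the kind of area-versus-latency trade-off that choosing phase values before scheduling exposes; the graph, the scheduler, and the cost measures here are illustrative assumptions only.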
