Global scheduling with code-motions for high-level synthesis applications

In this paper, we present a global scheduling technique for synthesis applications. The algorithm accepts a specification containing conditional branches and while-loop constructs and schedules it for a given set of resources. It performs several types of code motions across basic blocks and trades off cost against performance. Several real-life examples taken from Numerical Recipes in C are used to demonstrate the efficacy of the approach. The results indicate that code motions are very important for achieving significant speed-ups in synthesis applications.
