FACT: a framework for applying throughput and power optimizing transformations to control-flow-intensive behavioral descriptions

In this paper, we present an algorithm for applying a general class of transformations to control-flow-intensive behavioral descriptions. Our algorithm is based on the observation that incorporating scheduling information can help guide the selection and application of candidate transformations, and can significantly enhance the quality of the synthesized solution. The efficacy of the selected throughput and power optimizing transformations is enhanced by the ability of our algorithm to transcend basic blocks in the behavioral description, an ability imparted by a general technique we have devised. Our system currently supports associativity, commutativity, distributivity, constant propagation, code motion, and loop unrolling. It is integrated with a scheduler that performs implicit loop unrolling and functional pipelining, and that can parallelize the execution of independent iterative constructs whose bodies can share resources. Other transformations can easily be incorporated within the framework. We demonstrate the efficacy of our algorithm by applying it to several commonly available benchmarks. Upon synthesis, behaviors transformed by our algorithm showed, on average, a 2.5-fold improvement in throughput over an existing transformation algorithm, and a 57.6% improvement in power over designs produced without the benefit of our algorithm.
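To make two of the listed transformations concrete, the following is a minimal Python sketch of associativity (rebalancing a chain of additions to shorten the critical path) and distributivity (factoring a common multiplicand to save a multiplier). It is not the FACT algorithm: the expression representation, the unit-delay cost model, and the helper names `depth`, `count`, `operands`, `balance`, and `factor` are all assumptions made for illustration, and no scheduling information is used.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Op:
    """A binary operation node; leaves are plain variable names (str)."""
    kind: str  # "+" or "*"
    left: "Expr"
    right: "Expr"


Expr = Union[str, Op]


def depth(e: Expr) -> int:
    """Critical-path length, assuming every operation takes one time unit."""
    if isinstance(e, str):
        return 0
    return 1 + max(depth(e.left), depth(e.right))


def count(e: Expr, kind: str) -> int:
    """Number of operations of the given kind in the expression."""
    if isinstance(e, str):
        return 0
    return int(e.kind == kind) + count(e.left, kind) + count(e.right, kind)


def operands(e: Expr, kind: str) -> list:
    """Flatten a chain of one associative operator into its operand list."""
    if isinstance(e, Op) and e.kind == kind:
        return operands(e.left, kind) + operands(e.right, kind)
    return [e]


def balance(items: list, kind: str) -> Expr:
    """Associativity: rebuild the operand list as a balanced tree."""
    if len(items) == 1:
        return items[0]
    mid = len(items) // 2
    return Op(kind, balance(items[:mid], kind), balance(items[mid:], kind))


def factor(e: Expr) -> Expr:
    """Distributivity: rewrite a*b + a*c as a*(b + c) when the left factors match."""
    if (
        isinstance(e, Op) and e.kind == "+"
        and isinstance(e.left, Op) and e.left.kind == "*"
        and isinstance(e.right, Op) and e.right.kind == "*"
        and e.left.left == e.right.left
    ):
        return Op("*", e.left.left, Op("+", e.left.right, e.right.right))
    return e


# ((a + b) + c) + d: three additions in sequence.
chain = Op("+", Op("+", Op("+", "a", "b"), "c"), "d")
balanced = balance(operands(chain, "+"), "+")

# a*b + a*c: two multiplications and one addition.
product_sum = Op("+", Op("*", "a", "b"), Op("*", "a", "c"))
factored = factor(product_sum)
```

Under this unit-delay model, rebalancing shortens the addition chain's critical path from three operations to two without changing the operation count, while factoring removes one multiplication. Which rewrite actually helps in a synthesized design depends on the schedule and the available resources, which is the kind of information the abstract says the algorithm uses to guide its choices.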
