The Impact of Loop Unrolling on Controller Delay in High Level Synthesis

Loop unrolling is a well-known compiler optimization that can lead to significant performance improvements. When used in high-level synthesis (HLS), unrolling can affect controller complexity and delay. We study the effect of the loop unrolling factor on the delay of controllers generated during HLS. We propose a technique to predict controller delay as a function of the loop unrolling factor, and combine this prediction with other search-space pruning methods to automatically determine the optimal loop unrolling factor for which the controller delay fits within a specified time budget, without exhaustive exploration. Experimental results indicate that the predicted delays are close to the measured delays, and that prediction is significantly faster than exhaustive synthesis.
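As a rough illustration of the budget-constrained search described above, the sketch below picks the largest unroll factor whose predicted controller delay fits a time budget. It is not the method from this work: the function names, the restriction to factors that divide the trip count, the reading of "optimal" as "largest feasible", and the linear `toy_delay_model` are all assumptions made for the example, and the other pruning methods mentioned in the abstract are not modeled.

```python
from typing import Callable, List, Optional


def candidate_factors(trip_count: int) -> List[int]:
    """Unroll factors that divide the trip count evenly.

    Assumption: only exact divisors are considered, so no remainder
    loop is needed. This is a simple pruning rule chosen for the sketch.
    """
    return [u for u in range(1, trip_count + 1) if trip_count % u == 0]


def pick_unroll_factor(
    trip_count: int,
    budget_ns: float,
    predict_delay_ns: Callable[[int], float],
) -> Optional[int]:
    """Return the largest candidate factor whose predicted delay fits the budget.

    Every candidate is checked with the (cheap) predictor instead of
    being synthesized. Returns None when no candidate fits.
    """
    best = None
    for u in candidate_factors(trip_count):
        if predict_delay_ns(u) <= budget_ns:
            best = u  # candidates are ascending, so the last fit is the largest
    return best


def toy_delay_model(u: int) -> float:
    """Hypothetical predictor: delay grows linearly with the unroll factor.

    The coefficients are made up for illustration only.
    """
    return 2.0 + 0.25 * u


if __name__ == "__main__":
    print(pick_unroll_factor(16, 3.5, toy_delay_model))
```

With a trip count of 16 the candidates are 1, 2, 4, 8 and 16; under the toy model only 1, 2 and 4 stay within a 3.5 ns budget, so the sketch selects 4. A real flow would replace `toy_delay_model` with a delay estimate derived from the controller structure.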
