Using Speculative Functional Units in High-Level Synthesis

Speculative Functional Units (SFUs) enable a new execution paradigm for High Level Synthesis (HLS). SFUs are arithmetic functional units that predict the carry signal instead of waiting for it to propagate, which shortens the critical path. Their performance depends on how often the carry is predicted correctly, i.e. the hit rate of the predictor. SFUs therefore reduce the critical path delay at a low cost, but current HLS techniques cannot use them, because a datapath built from SFUs needs hardware support to recover from carry mispredictions. In this paper, we present two techniques for designing a datapath controller that allows seamless deployment of SFUs in HLS. The first approach stalls the entire datapath on each misprediction and resumes execution once the correct carry value is known. The second approach decouples the functional unit that suffered the misprediction from the rest of the datapath, so the remaining SFUs continue executing and different units may be in different scheduling states at any given time. Experiments show that execution time can be reduced by up to 38%, and by 33% on average.
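The difference between the two controller styles can be illustrated with a small behavioral model. The sketch below is not the controller described in the paper: it assumes an adder split into two halves with a static carry prediction, a fixed penalty of one extra cycle per misprediction, and SFUs with no data dependencies between them. All function names are hypothetical.

```python
from typing import List, Tuple


def speculative_add(a: int, b: int, width: int = 16,
                    predicted_carry: int = 0) -> Tuple[int, bool]:
    """Model a two-fragment speculative adder for unsigned `width`-bit values.

    The upper half is assumed to start from `predicted_carry` instead of
    waiting for the lower half's carry-out. Returns (correct_sum, hit), where
    `hit` tells whether the prediction matched the real carry.
    """
    half = width // 2
    mask = (1 << half) - 1
    low = (a & mask) + (b & mask)
    real_carry = low >> half
    high = (a >> half) + (b >> half) + real_carry  # value after any correction
    result = ((high << half) | (low & mask)) & ((1 << width) - 1)
    return result, real_carry == predicted_carry


def cycles_global_stall(hits: List[List[bool]]) -> int:
    """hits[s][u] is True if unit u predicted correctly in scheduling step s.

    Assumption: any misprediction in a step stalls the whole datapath for
    one extra cycle.
    """
    return sum(1 if all(step) else 2 for step in hits)


def cycles_decoupled(hits: List[List[bool]]) -> int:
    """Assumption: each unit pays only for its own mispredictions and the
    units are independent, so the slowest unit sets the total time."""
    if not hits:
        return 0
    units = len(hits[0])
    per_unit = [sum(1 if step[u] else 2 for step in hits)
                for u in range(units)]
    return max(per_unit)


if __name__ == "__main__":
    print(speculative_add(0x00FF, 0x0001))   # carry crosses the halves: a miss
    trace = [[True, False], [False, True], [True, True]]
    print(cycles_global_stall(trace), cycles_decoupled(trace))
```

In this simplified model the decoupled policy never takes more cycles than the global stall, because a miss in one unit does not delay the others. A real datapath has data dependencies between operations, which would limit how far units can drift apart.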
