High-level synthesis promises a significant shortening of the FPGA design cycle compared with design entry using register transfer level (RTL) languages. Recent evaluations report that C-to-RTL flows can produce results of a quality close to that of hand-crafted designs [1]. However, algorithms that use dynamic, pointer-based data structures, which are common in software, remain difficult to implement well. In this paper, we describe a comparative case study using Xilinx Vivado HLS as an exemplary state-of-the-art high-level synthesis tool. Our test cases are two alternative algorithms for the same compute-intensive machine learning technique (clustering) with significantly different computational properties. We compare a data-flow centric implementation with a recursive tree traversal implementation that incorporates complex data-dependent control flow and makes use of pointer-linked data structures and dynamic memory allocation. The outcome of this case study is twofold. For the first test case, we confirm similar performance between the hand-written and automatically generated RTL designs. The second test case reveals a latency degradation by a factor of more than 30 if the source code is not altered prior to high-level synthesis. We identify the reasons for this shortcoming and present code transformations that narrow the performance gap to a factor of four. We generalise our source-to-source transformations, whose automation motivates future research directions for improving high-level synthesis of dynamic data structures.
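To give a flavour of the kind of source-to-source rewriting involved, the following is a minimal sketch in C. It is not taken from the paper: it assumes a generic binary tree, and the names `node_t`, `pool`, `node_alloc`, `tree_sum`, `MAX_NODES` and `NIL` are hypothetical. It illustrates three rewrites commonly applied before high-level synthesis: pointers become array indices, dynamic allocation becomes a statically sized pool, and recursion becomes a loop over an explicit bounded stack.

```c
/* Illustrative only: all names and sizes here are assumptions. */
#define MAX_NODES 16   /* assumed upper bound on the tree size */
#define NIL (-1)       /* index value standing in for a null pointer */

/* Child links are indices into a node array instead of pointers. */
typedef struct {
    int value;
    int left;
    int right;
} node_t;

/* A statically sized pool replaces malloc/free. */
static node_t pool[MAX_NODES];
static int pool_used = 0;

/* Hands out the next free slot, or NIL when the pool is exhausted. */
static int node_alloc(int value) {
    if (pool_used >= MAX_NODES) {
        return NIL;
    }
    int idx = pool_used++;
    pool[idx].value = value;
    pool[idx].left = NIL;
    pool[idx].right = NIL;
    return idx;
}

/* Sums all node values. An explicit stack replaces the recursive calls.
 * Each node is pushed at most once, so MAX_NODES entries suffice. */
static int tree_sum(int root) {
    int stack[MAX_NODES];
    int sp = 0;
    int sum = 0;

    if (root != NIL) {
        stack[sp++] = root;
    }
    while (sp > 0) {
        int idx = stack[--sp];
        sum += pool[idx].value;
        if (pool[idx].right != NIL) {
            stack[sp++] = pool[idx].right;
        }
        if (pool[idx].left != NIL) {
            stack[sp++] = pool[idx].left;
        }
    }
    return sum;
}
```

A caller would build the tree with `node_alloc`, link children by assigning the returned indices to the `left` and `right` fields, and pass the root index to `tree_sum`. The fixed bounds make the memory footprint and the maximum stack depth known at compile time, which is what allows such code to map onto on-chip memories.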
[1] George A. Constantinides et al. FPGA-based K-means clustering using tree-based data structures. 2013 23rd International Conference on Field Programmable Logic and Applications, 2013.
[2] D.M. Mount et al. An Efficient k-Means Clustering Algorithm: Analysis and Implementation. IEEE Trans. Pattern Anal. Mach. Intell., 2002.
[3] Shashank Dabral et al. Lessons and Experiences with High-Level Synthesis. IEEE Design & Test of Computers, 2009.
[4] A. Winship. Interest. 1893.
[5] Dirk Stroobandt et al. An overview of today’s high-level synthesis tools. Design Automation for Embedded Systems, 2012.
[6] Jason Helge Anderson et al. LegUp: high-level synthesis for FPGA-based processor/accelerator systems. FPGA '11, 2011.
[7] Adrian Park et al. Designing Modular Hardware Accelerators in C with ROCCC 2.0. 2010 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2010.