Partitioning DSP applications to different granularity reconfigurable hardware

In this paper, we propose an automated methodology for partitioning applications between fine-grain and coarse-grain reconfigurable hardware to improve performance. The fine-grain logic is implemented by an embedded FPGA unit, while the coarse-grain reconfigurable hardware is a 2-dimensional array of processing elements. Both kinds of reconfigurable functional units are embedded in a single hybrid platform. The proposed methodology consists of three main steps: kernel identification, mapping onto the coarse-grain reconfigurable array, and mapping onto the fine-grain reconfigurable hardware. Experiments with five real-world applications show speedups ranging from 1.4 to 3.9 relative to an all-FPGA solution.
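As an illustration only, the three steps can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the methodology's actual algorithm: the `Block` record, its cycle counts, the `max_kernels` limit, and the ranking rule (heaviest blocks first) are all hypothetical, and communication and reconfiguration overheads between the two units are ignored.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A code block with hypothetical execution-time estimates (in cycles)."""
    name: str
    fpga_cycles: int  # assumed total cycles when run on the fine-grain FPGA
    cgra_cycles: int  # assumed total cycles when run on the coarse-grain array


def partition(blocks, max_kernels):
    """Assign each block to 'coarse' or 'fine' hardware.

    Step 1 (kernel identification): rank blocks by their all-FPGA time and
    keep at most `max_kernels` that actually run faster on the array.
    Step 2: map those kernels onto the coarse-grain array.
    Step 3: map every remaining block onto the fine-grain FPGA.
    """
    ranked = sorted(blocks, key=lambda b: b.fpga_cycles, reverse=True)
    kernels = {b.name for b in ranked[:max_kernels]
               if b.cgra_cycles < b.fpga_cycles}
    return {b.name: "coarse" if b.name in kernels else "fine" for b in blocks}


def speedup(blocks, mapping):
    """Speedup of the partitioned design relative to an all-FPGA solution."""
    all_fpga = sum(b.fpga_cycles for b in blocks)
    hybrid = sum(b.cgra_cycles if mapping[b.name] == "coarse" else b.fpga_cycles
                 for b in blocks)
    return all_fpga / hybrid


# Made-up numbers for a small DSP-like application (not measured data).
blocks = [
    Block("fir_loop", fpga_cycles=800, cgra_cycles=200),
    Block("init", fpga_cycles=100, cgra_cycles=150),
    Block("output", fpga_cycles=100, cgra_cycles=90),
]
mapping = partition(blocks, max_kernels=1)
print(mapping)
print(speedup(blocks, mapping))  # 1000 / 400 = 2.5
```

With these made-up numbers, only the dominant loop is moved to the coarse-grain array, and the speedup is computed against the all-FPGA baseline, the same baseline used for the reported results.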