Floating point acceleration for stream processing applications in dynamically reconfigurable processors

Runtime reconfigurable processors offer a high degree of flexibility, allowing them to adapt dynamically to different applications and requirements. They couple a standard processor with a runtime reconfigurable fabric (such as an embedded FPGA) to which computationally intensive kernels are offloaded. In this paper, we present the design and architecture of a flexible accelerator for floating-point operations in stream processing applications. Integrating it into an existing reconfigurable processor requires managing the frequency difference between the sequential processor (high frequency) and the parallel accelerators (low frequencies). Our results show that the reconfigurable processor with our accelerator achieves 63.70× and 3.85× better performance-per-area efficiency than the baseline processor with a soft-float implementation and with a high-performance floating-point unit, respectively.
