A framework for effective exploitation of partial reconfiguration in dataflow computing

Reconfigurable hardware is increasingly used to build power-efficient supercomputing clusters, and large speedups have already been demonstrated in several high-performance computing (HPC) applications. Partial reconfiguration (PR) has the potential to further increase performance and power efficiency in many applications; however, there is currently very limited support for transforming a traditional design into a reconfigurable one. In this work, we introduce a design methodology for PR designs that combines application analysis, partitioning, mapping, and scheduling, and that supports fast exploration of design options. These steps are integrated into an automated toolchain that lets a designer implement reconfigurable designs with simple guidance through a graphical interface. We demonstrate our approach by applying the methodology to an image processing application and implementing the proposed design on a Maxeler MaxWorkstation.
