A semi-automatic toolchain for reconfigurable multiprocessor systems-on-chip: architecture development and application partitioning (abstract only)

Efficient, automated application partitioning and mapping for multiprocessor systems remains a challenging task in both academia and industry. Introducing reconfigurable hardware in this domain helps to meet application requirements more efficiently, because the hardware can be adapted at design time and at runtime. The combination of multiprocessor systems-on-chip (MPSoC) and reconfigurable hardware results in the RAMPSoC approach (Runtime Adaptive MPSoC). A RAMPSoC consists of an adaptive network of processors and hardware accelerators. This new degree of freedom in MPSoC technology, and the design space that results from it, must be handled by a suitable toolchain that hides the complexity of the hardware architecture and its implementation alternatives from the developer. This work investigates an approach for a semi-automatic toolchain that supports the development of the hardware architecture as well as application partitioning and mapping. A multistep approach is used to analyze and partition the software application. First, each function of the software application is profiled and the communication overhead between the functions is analyzed. The results of the profiling and the communication analysis are used as parameters for the cost function of a hierarchical clustering algorithm. The functions are clustered into multiple application modules for a given number of processors. In a second step, the application module assigned to each processor is analyzed for computation-intensive blocks or loops. This yields a hardware/software co-design partitioning with suggestions for possible hardware accelerators for each of the processors. This makes it possible to achieve performance close to the maximum for each local processor and therefore, in general, for the MPSoC as a whole.
The third and final step in the design flow generates the bitstream for the complete system together with the software executables for each of the processors. The semi-automatic toolchain has been evaluated using an image processing algorithm.
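
The first step, clustering functions into one application module per processor, can be illustrated with a minimal sketch. The abstract does not state the actual cost function, so the one used here is an assumption: merging two clusters is cheaper the more data they exchange and more expensive the larger their combined execution time. The function names, the profile values and the `balance_weight` parameter are likewise hypothetical.

```python
from itertools import combinations


def cluster_functions(exec_time, comm, num_processors, balance_weight=1.0):
    """Greedy agglomerative clustering of functions into application modules.

    exec_time: dict mapping function name -> profiled execution time
    comm: dict mapping (caller, callee) -> amount of data exchanged
    num_processors: number of modules to produce
    balance_weight: assumed tuning knob that penalizes heavily loaded modules
    """
    # Start with one cluster per function.
    clusters = [frozenset([name]) for name in sorted(exec_time)]

    def load(cluster):
        return sum(exec_time[f] for f in cluster)

    def traffic(a, b):
        # Communication in both directions between two clusters.
        return sum(comm.get((x, y), 0) + comm.get((y, x), 0)
                   for x in a for y in b)

    def merge_cost(a, b):
        # Assumed cost: combined load is penalized, shared traffic is rewarded.
        return balance_weight * (load(a) + load(b)) - traffic(a, b)

    # Repeatedly merge the cheapest pair until one module per processor remains.
    while len(clusters) > num_processors:
        a, b = min(combinations(clusters, 2), key=lambda p: merge_cost(*p))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


# Hypothetical profile of a small image processing pipeline.
exec_time = {"read": 2, "filter": 10, "edge": 8, "write": 2}
comm = {("read", "filter"): 100, ("filter", "edge"): 20, ("edge", "write"): 100}

modules = cluster_functions(exec_time, comm, num_processors=2)
for module in modules:
    print(sorted(module))
```

With these made-up numbers, the functions that exchange the most data end up on the same processor, and only the comparatively light `filter` to `edge` transfer crosses the processor boundary. A real toolchain would take `exec_time` and `comm` from the profiling and communication analysis rather than from hand-written dictionaries.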