Analytical Delay Model for CPU-FPGA Data Paths in Programmable System-on-Chip FPGA

The hard CPU cores in programmable Systems-on-Chip (SoCs) often communicate with the soft IP cores in the reconfigurable fabric through dedicated ports. The data paths corresponding to different ports have different performance characteristics, which make them suitable for different applications. This article studies analytical performance models for transferring data stored on the CPU side to the FPGA side and vice versa, through all the communication ports and data paths available in a typical programmable SoC. The proposed methodology for extracting the cycle-accurate delay models is applicable to other similar programmable SoCs. Evaluation experiments show that the error of the proposed models is within an acceptable bound of 5%.
