CNN Accelerator with Non-Blocking Network Design

In this paper, we design a new hardware architecture that uses a non-blocking network to accelerate convolutional neural networks (CNNs). Unlike many other CNN accelerators, which are only capable of supporting a specific type of network model, our design exploits the rearrangeability of the non-blocking network to provide both high flexibility and high parallelism. We successfully implemented our CNN accelerator on the Xilinx Virtex UltraScale+ FPGA VCU128 Evaluation Kit and evaluated it by running a CNN model, LeNet-5.
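
To make the notion of rearrangeability concrete, the following minimal Python sketch routes an arbitrary permutation through a rearrangeably non-blocking network. It is an illustration under stated assumptions, not the accelerator's implementation: the abstract does not name the network topology, so the sketch assumes a Benes network (a classic rearrangeable topology built from 2x2 switches) and the standard looping algorithm. The function names `route` and `apply_network` are hypothetical.

```python
def route(perm):
    """Compute 2x2 switch settings so that input i reaches output perm[i].

    Assumes a Benes topology and len(perm) is a power of two (>= 2).
    Returns a bool for a single switch (True = crossed), otherwise a tuple
    (input_switches, upper_subnet, lower_subnet, output_switches).
    """
    n = len(perm)
    if n == 2:
        return perm[0] == 1
    inv = [0] * n
    for src, dst in enumerate(perm):
        inv[dst] = src
    # Looping algorithm: assign each input to the upper (0) or lower (1)
    # subnetwork so that the two inputs of one input switch, and the two
    # inputs bound for one output switch, never share a subnetwork.
    side = [None] * n
    for start in range(n):
        x = start
        while side[x] is None:
            side[x] = 0
            y = inv[perm[x] ^ 1]  # input sharing x's output switch
            side[y] = 1
            x = y ^ 1             # input sharing y's input switch
    half = n // 2
    in_sw = [side[2 * i] == 1 for i in range(half)]
    out_sw = [False] * half
    upper = [0] * half
    lower = [0] * half
    for x in range(n):
        if side[x] == 0:
            upper[x // 2] = perm[x] // 2
            out_sw[perm[x] // 2] = perm[x] % 2 == 1
        else:
            lower[x // 2] = perm[x] // 2
    return in_sw, route(upper), route(lower), out_sw


def apply_network(settings, data):
    """Pass data through the switches described by settings."""
    if len(data) == 2:
        return [data[1], data[0]] if settings else list(data)
    in_sw, upper, lower, out_sw = settings
    up_in, lo_in = [], []
    for i, crossed in enumerate(in_sw):
        a, b = data[2 * i], data[2 * i + 1]
        if crossed:
            a, b = b, a
        up_in.append(a)
        lo_in.append(b)
    up_out = apply_network(upper, up_in)
    lo_out = apply_network(lower, lo_in)
    out = []
    for j, crossed in enumerate(out_sw):
        a, b = up_out[j], lo_out[j]
        if crossed:
            a, b = b, a
        out += [a, b]
    return out


# Usage: every input lands on its requested output port.
perm = [2, 0, 3, 1]
data = ["a", "b", "c", "d"]
out = apply_network(route(perm), data)
assert all(out[perm[i]] == data[i] for i in range(len(perm)))
```

Because a routing exists for any permutation, the same physical network can be reconfigured for different data-movement patterns by changing only the switch settings, which is the property the abstract credits for flexibility.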