A Full HD 60 fps CNN Super Resolution Processor with Selective Caching based Layer Fusion for Mobile Devices

A high-throughput CNN super resolution (SR) processor is proposed for memory efficient SR processing. It has three key features: 1) selective caching based layer fusion to minimize external memory access (EMA), 2) memory compaction scheme for smaller on-chip memory footprint, and 3) cyclic ring core architecture to increase the throughput with improved core utilization. As a result, the implemented processor achieves 60 frames-per-second throughput in generating full HD images.

[1]  Suk-Ju Kang,et al.  An Energy-Efficient FPGA-Based Deconvolutional Neural Networks Accelerator for Single Image Super-Resolution , 2018, IEEE Transactions on Circuits and Systems for Video Technology.

[2]  Manoj Alwani,et al.  Fused-layer CNN accelerators , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[3]  Narendra Ahuja,et al.  Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Xiaoou Tang,et al.  Accelerating the Super-Resolution Convolutional Neural Network , 2016, ECCV.

[5]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.