Towards an Efficient Accelerator for DNN-Based Remote Sensing Image Segmentation on FPGAs

Among popular techniques for remote sensing image (RSI) segmentation, deep neural networks (DNNs) have attracted increasing interest, but their high computational complexity largely limits their applicability on on-board space platforms. Various dedicated hardware designs on FPGAs have therefore been developed to accelerate DNNs. Designing an efficient accelerator for DNN-based segmentation algorithms is nevertheless difficult, because these algorithms must perform both convolution and deconvolution, two fundamentally different types of operation. This paper proposes a uniform architecture that efficiently implements both convolution and deconvolution in one vector multiplication module. The architecture is further optimized by exploiting different levels of parallelism and layer fusion to achieve low latency for RSI segmentation tasks. Moreover, an optimized DNN model is developed for real-time RSI segmentation, which achieves favorable accuracy compared with other methods. The proposed hardware accelerator efficiently implements the DNN model on Intel's Arria 10 device, demonstrating a throughput of 1578 GOPS and a latency of 17.4 ms, i.e., 57 images per second.
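To make the idea of serving convolution and deconvolution from a single vector multiplication primitive concrete, the following minimal Python sketch shows one well-known software formulation: a transposed convolution (deconvolution) rewritten as an ordinary convolution over a zero-inserted, zero-padded input with a flipped kernel. It is an illustration under stated assumptions, not the architecture proposed in the paper, whose actual mapping is not described in the abstract. The function names (`dot`, `conv1d`, `deconv1d_scatter`, `deconv1d_via_dot`) are hypothetical, and the example is restricted to one dimension for brevity.

```python
def dot(a, b):
    """Shared primitive: element-wise multiply and accumulate two vectors."""
    return sum(x * y for x, y in zip(a, b))


def conv1d(x, w, stride=1):
    """Valid 1-D convolution (cross-correlation, as used in DNN layers)."""
    k = len(w)
    return [dot(x[i:i + k], w) for i in range(0, len(x) - k + 1, stride)]


def deconv1d_scatter(x, w, stride=1):
    """Reference transposed convolution: each input scatters a scaled kernel."""
    k = len(w)
    out = [0] * ((len(x) - 1) * stride + k)
    for i, xi in enumerate(x):
        for j, wj in enumerate(w):
            out[i * stride + j] += xi * wj
    return out


def deconv1d_via_dot(x, w, stride=1):
    """Same transposed convolution, expressed only through conv1d/dot.

    Steps: insert (stride - 1) zeros between inputs, pad (k - 1) zeros on
    both sides, then run a stride-1 convolution with the flipped kernel.
    """
    k = len(w)
    upsampled = []
    for i, xi in enumerate(x):
        upsampled.append(xi)
        if i < len(x) - 1:
            upsampled.extend([0] * (stride - 1))
    padded = [0] * (k - 1) + upsampled + [0] * (k - 1)
    return conv1d(padded, w[::-1])


if __name__ == "__main__":
    x = [1, 2, 3]
    w = [1, 0, -1]
    print(conv1d([1, 2, 3, 4], w))           # ordinary convolution
    print(deconv1d_scatter(x, w, stride=2))  # reference deconvolution
    print(deconv1d_via_dot(x, w, stride=2))  # same values via dot() only
```

Both deconvolution paths produce the same output, so the only arithmetic the second path needs is the `dot` primitive already used by convolution. The zero-insertion formulation spends multiplications on zeros, which is one reason a dedicated hardware design might choose a different mapping.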
