Accelerating 3D CNN-based Lung Nodule Segmentation on a Multi-FPGA System

Lung nodule segmentation is one of the most important steps in many Computer Aided Detection (CAD) systems for lung nodule identification and classification. Three-dimensional convolutional neural networks (3D CNNs) have become a promising method for lung nodule segmentation because they can achieve higher detection accuracy than conventional methods. Prior work has shown that FPGAs can provide the most energy-efficient solution for CNN acceleration. However, the high computational complexity and memory requirements of 3D CNNs make them challenging to accelerate on a single FPGA, which in turn bottlenecks the performance of a 3D CNN-based CAD system. Accordingly, in this work we focus on accelerating 3D CNN-based lung nodule segmentation on a multi-FPGA platform. We propose an efficient mapping scheme that exploits the massive parallelism provided by the platform and maximizes the computational efficiency of the accelerators. Experimental results show that our system achieves high computational efficiency and, as a result, a state-of-the-art performance of 14.5 TOPS at 200 MHz. Comparisons with CPU and GPU solutions show that our system achieves a 29.4x performance gain over the CPU and a 10.5x energy-efficiency improvement over the GPU.
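To put the reported throughput in perspective, the short sketch below converts 14.5 TOPS at 200 MHz into operations per clock cycle across the whole system. The function names are illustrative, and the convention that one multiply-accumulate (MAC) counts as two operations is an assumption, since the abstract does not state how operations are counted. The number of FPGAs is not given here either, so no per-device figure is derived.

```python
def ops_per_cycle(tops: float, clock_mhz: float) -> float:
    """Operations completed per clock cycle for a given throughput and clock."""
    ops_per_second = tops * 1e12        # tera-operations per second -> operations per second
    cycles_per_second = clock_mhz * 1e6  # MHz -> Hz
    return ops_per_second / cycles_per_second


def macs_per_cycle(tops: float, clock_mhz: float, ops_per_mac: int = 2) -> float:
    """Equivalent MACs per cycle, ASSUMING one MAC counts as `ops_per_mac` operations."""
    return ops_per_cycle(tops, clock_mhz) / ops_per_mac


if __name__ == "__main__":
    # Figures taken from the abstract: 14.5 TOPS at 200 MHz.
    print(ops_per_cycle(14.5, 200.0))
    print(macs_per_cycle(14.5, 200.0))
```

Under the two-operations-per-MAC assumption, the system as a whole would need to sustain tens of thousands of MACs every cycle, which illustrates why the mapping scheme has to keep the accelerators on all FPGAs busy.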
