Deep SCNN-Based Real-Time Object Detection for Self-Driving Vehicles Using LiDAR Temporal Data

Real-time accurate detection of three-dimensional (3D) objects is a fundamental necessity for self-driving vehicles. Most existing computer vision approaches are based on convolutional neural networks (CNNs). Although the CNN-based approaches can achieve high detection accuracy, their high energy consumption is a severe drawback. To resolve this problem, novel energy efficient approaches should be explored. Spiking neural network (SNN) is a promising candidate because it has orders-of-magnitude lower energy consumption than CNN. Unfortunately, the studying of SNN has been limited in small networks only. The application of SNN for large 3D object detection networks has remain largely open. In this paper, we integrate spiking convolutional neural network (SCNN) with temporal coding into the YOLOv2 architecture for real-time object detection. To take the advantage of spiking signals, we develop a novel data preprocessing layer that translates 3D point-cloud data into spike time data. We propose an analog circuit to implement the non-leaky integrate and fire neuron used in our SCNN, from which the energy consumption of each spike is estimated. Moreover, we present a method to calculate the network sparsity and the energy consumption of the overall network. Extensive experiments have been conducted based on the KITTI dataset, which show that the proposed network can reach competitive detection accuracy as existing approaches, yet with much lower average energy consumption. If implemented in dedicated hardware, our network could have a mean sparsity of 56.24% and extremely low total energy consumption of 0.247mJ only. Implemented in NVIDIA GTX 1080i GPU, we can achieve 35.7 fps frame rate, high enough for real-time object detection.

[1]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Yin Zhou,et al.  VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[3]  Jiaolong Xu,et al.  Multiview random forest of local experts combining RGB and LIDAR data for pedestrian detection , 2015, 2015 IEEE Intelligent Vehicles Symposium (IV).

[4]  S. Thorpe,et al.  STDP-based spiking deep convolutional neural networks for object recognition , 2018 .

[5]  Dushyant Rao,et al.  Vote3Deep: Fast object detection in 3D point clouds using efficient convolutional neural networks , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[6]  Hossein Valavi,et al.  A Mixed-Signal Binarized Convolutional-Neural-Network Accelerator Integrating Dense Weight Storage and Multiplication for Reduced Data Movement , 2018, 2018 IEEE Symposium on VLSI Circuits.

[7]  Horst-Michael Groß,et al.  Complexer-YOLO: Real-Time 3D Object Detection and Tracking on Semantic Point Clouds , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[8]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Walter Senn,et al.  Fast and deep neuromorphic learning with time-to-first-spike coding , 2019, ArXiv.

[11]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Steven Lake Waslander,et al.  Joint 3D Proposal Generation and Object Detection from View Aggregation , 2017, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[13]  Jyrki Alakuijala,et al.  Temporal Coding in Spiking Neural Networks with Alpha Synaptic Function , 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[14]  Hesham Mostafa,et al.  Supervised Learning Based on Temporal Coding in Spiking Neural Networks , 2016, IEEE Transactions on Neural Networks and Learning Systems.

[15]  Leonidas J. Guibas,et al.  PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space , 2017, NIPS.

[16]  Ingmar Posner,et al.  Voting for Voting in Online Point Cloud Object Detection , 2015, Robotics: Science and Systems.

[17]  Ji Wan,et al.  Multi-view 3D Object Detection Network for Autonomous Driving , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Filip Ponulak,et al.  Introduction to spiking neural networks: Information processing, learning and applications. , 2011, Acta neurobiologiae experimentalis.

[20]  Anantha Chandrakasan,et al.  Conv-RAM: An energy-efficient SRAM with embedded convolution computation for low-power CNN-based machine learning applications , 2018, 2018 IEEE International Solid - State Circuits Conference - (ISSCC).

[21]  Bin Dai,et al.  SCNet: Subdivision Coding Network for Object Detection Based on 3D Point Cloud , 2019, IEEE Access.

[22]  Cristiano Premebida,et al.  Pedestrian detection combining RGB and dense LIDAR data , 2014, 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[23]  Ying Chen,et al.  Direct training based spiking convolutional neural networks for object recognition , 2019, ArXiv.

[24]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Ming C. Wu,et al.  Lidar System Architectures and Circuits , 2017, IEEE Communications Magazine.

[26]  Xiaogang Wang,et al.  PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Leonidas J. Guibas,et al.  Frustum PointNets for 3D Object Detection from RGB-D Data , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[28]  Peter Dayan,et al.  Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems , 2001 .

[29]  Krzysztof Czarnecki,et al.  MLOD: A multi-view 3D object detection based on robust feature fusion method , 2019, 2019 IEEE Intelligent Transportation Systems Conference (ITSC).

[30]  Saeed Reza Kheradpisheh,et al.  S4NN: temporal backpropagation for spiking neural networks with one spike per neuron , 2020, Int. J. Neural Syst..

[31]  Timothée Masquelier,et al.  Deep Learning in Spiking Neural Networks , 2018, Neural Networks.

[32]  Xaq Pitkow,et al.  Skip Connections Eliminate Singularities , 2017, ICLR.