DenseLightNet: A Light-Weight Vehicle Detection Network for Autonomous Driving

In recent years, vehicle detectors built on deep convolutional neural networks (DCNNs) have been widely used in autonomous driving. Under complex road traffic conditions, the detector is expected to run at high speed with high accuracy. However, the limited computing power and storage space on an autonomous vehicle often restrict the deployment of advanced DCNN detectors, so lightweight yet powerful detectors are in great demand. Recently, group convolution has been proposed as a novel convolution algorithm that reduces floating-point operations and makes the detection network lighter and faster. In practice, however, increasing the number of groups does not always improve detection speed and sometimes degrades performance. In addition, existing guidelines for network design do not indicate how to choose the number of groups in a group convolution so as to maximize overall detection speed. To this end, we propose three new guidelines that determine the valid range of the group number, and we design a lightweight detection network, DenseLightNet, based on these criteria. The proposed detector runs three times faster than the current real-time detector YoloV3 while having a much smaller model size.
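The trade-off behind the group number can be illustrated with a rough cost model. The sketch below is not the paper's method or its three guidelines; it only uses the standard operation count for a group convolution and the memory-access estimate popularized by ShuffleNet V2 (feature-map reads and writes plus weight reads). The function name `group_conv_cost` and the layer sizes are hypothetical, chosen purely for illustration, and the model assumes stride 1 with "same" padding.

```python
def group_conv_cost(h, w, c_in, c_out, k=1, groups=1):
    """Estimate the cost of one k x k group convolution (stride 1, same padding).

    Returns (macs, memory_access):
      macs          - multiply-accumulate operations
      memory_access - elements read or written (input map, output map, weights)
    This is an illustrative cost model, not a measurement of real latency.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must both be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    weights = k * k * c_in * c_out // groups
    macs = h * w * weights
    # Feature-map traffic does not shrink with the group number.
    memory_access = h * w * (c_in + c_out) + weights
    return macs, memory_access


if __name__ == "__main__":
    # Hypothetical 1x1 layer on a 28x28 feature map with 256 channels in and out.
    for g in (1, 2, 4, 8):
        macs, mem = group_conv_cost(28, 28, 256, 256, k=1, groups=g)
        print(f"groups={g}: MACs={macs}, memory access={mem}, "
              f"memory access per MAC={mem / macs:.4f}")
```

Under this model the operation count falls in proportion to the group number, but the feature-map traffic stays fixed, so memory access per operation rises as groups increase. That is one plausible reason why a larger group number does not necessarily translate into faster detection on real hardware.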
