Light-Net: Lightweight Object Detector

Currently, object detectors based on CNN, such as RetinaNet, Faster-RCNN, CornerNet series, can achieve good performance, but have some common drawbacks, like large calculation cost, high model complexity and slow detection speed. In this paper, a new lightweight object detector is proposed, which adopted a density-based approach to merge the real boxes. To reduce calculation cost and improve detection speed, the tactic of multi-scale output is adopted to predict objects of different sizes with features of different scales. Furthermore, a new lightweight network model is proposed, which can show better performance in computation, FPS, and model complexity. Meanwhile, the separation of convolution is used to improve the basic convolution layer, which can achieve better results under the same number of filters. In the experiments, we verified the capability of our methods based on ablation experiment and model evaluation, which demonstrates the superiority of our method. Moreover, we have also conducted deep network and multichannel experiments on MS-COCO2014 datasets and achieved 20.9% mAP performance.

[1]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[3]  Yunhong Wang,et al.  Receptive Field Block Net for Accurate and Fast Object Detection , 2017, ECCV.

[4]  Larry S. Davis,et al.  SaccadeNet: A Fast and Accurate Object Detector , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[6]  Xiangyu Zhang,et al.  ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[7]  Hei Law,et al.  CornerNet: Detecting Objects as Paired Keypoints , 2018, ECCV.

[8]  Jian Sun,et al.  Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[11]  Fei Wang,et al.  CentripetalNet: Pursuing High-Quality Keypoint Pairs for Object Detection , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Tao Sheng,et al.  CFENet: An Accurate and Efficient Single-Shot Object Detector for Autonomous Driving , 2018, ArXiv.

[13]  Shifeng Zhang,et al.  Single-Shot Refinement Neural Network for Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[14]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Yuning Jiang,et al.  Acquisition of Localization Confidence for Accurate Object Detection , 2018, ECCV.

[16]  Gang Yu,et al.  ThunderNet: Towards Real-Time Generic Object Detection on Mobile Devices , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[17]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[18]  Qi Tian,et al.  CenterNet: Keypoint Triplets for Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[19]  Wei Liu,et al.  Learning Efficient Single-Stage Pedestrian Detectors by Asymptotic Localization Fitting , 2018, ECCV.

[20]  Xingyi Zhou,et al.  Bottom-Up Object Detection by Grouping Extreme and Center Points , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Andrew P. Witkin,et al.  Uniqueness of the Gaussian Kernel for Scale-Space Filtering , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Marios Savvides,et al.  Feature Selective Anchor-Free Module for Single-Shot Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  D.M. Mount,et al.  An Efficient k-Means Clustering Algorithm: Analysis and Implementation , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[24]  Ali Farhadi,et al.  YOLO9000: Better, Faster, Stronger , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Guosheng Lin,et al.  Exploring Bottom-Up and Top-Down Cues With Attentive Learning for Webly Supervised Object Detection , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[27]  Wei Liu,et al.  DSSD : Deconvolutional Single Shot Detector , 2017, ArXiv.

[28]  Luc Van Gool,et al.  Efficient Non-Maximum Suppression , 2006, 18th International Conference on Pattern Recognition (ICPR'06).

[29]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[30]  Bo Chen,et al.  MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.

[31]  Hao Chen,et al.  NAS-FCOS: Fast Neural Architecture Search for Object Detection , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Zhaohui Zheng,et al.  Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression , 2019, AAAI.

[33]  Takeo Kanade,et al.  A System for Video Surveillance and Monitoring , 2000 .

[34]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[35]  Yan Wang,et al.  Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[38]  Ross B. Girshick,et al.  Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[39]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[40]  Ali Farhadi,et al.  YOLOv3: An Incremental Improvement , 2018, ArXiv.

[41]  Larry S. Davis,et al.  Soft-NMS — Improving Object Detection with One Line of Code , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[42]  Forrest N. Iandola,et al.  SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size , 2016, ArXiv.

[43]  Yi Li,et al.  R-FCN: Object Detection via Region-based Fully Convolutional Networks , 2016, NIPS.

[44]  Thomas A. Funkhouser,et al.  Dilated Residual Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[45]  David A. McAllester,et al.  Cascade object detection with deformable part models , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[46]  François Chollet,et al.  Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[47]  Larry S. Davis,et al.  W4: Real-Time Surveillance of People and Their Activities , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[48]  Yi Yang,et al.  DenseBox: Unifying Landmark Localization with End to End Object Detection , 2015, ArXiv.

[49]  Liu Yong-liang Semi-automatic Road Extraction with Meanshift Algorithm , 2010 .

[50]  Andreas Geiger,et al.  Vision meets robotics: The KITTI dataset , 2013, Int. J. Robotics Res..

[51]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[52]  Abhinav Gupta,et al.  Training Region-Based Object Detectors with Online Hard Example Mining , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).