Paper Information - Object Detection with Convolutional Neural Networks

Object Detection with Convolutional Neural Networks

In this chapter, we present a brief overview of recent developments in object detection using convolutional neural networks (CNNs). Several classical CNN-based detectors are presented. Some of these developments concern the detector architecture itself, while others target specific problems, such as model degradation and small-scale object detection. The chapter also compares the performance of different models on several benchmark datasets. Through the discussion of these models, we hope to give readers a general idea of how CNN-based object detection has developed.
