Deep Learning Approaches for Detecting Objects from Images: A Review

Detecting objects in images is a challenging problem in computer vision and plays a crucial role in a wide range of real-time applications. The rapid growth of deep learning, driven by the availability of large training datasets and powerful GPUs, has helped the computer vision community build commercial products and services that were not possible a decade ago. Deep learning architectures, especially convolutional neural networks, have achieved state-of-the-art performance in worldwide visual recognition competitions such as ILSVRC and PASCAL VOC. Deep learning techniques learn features automatically, which reduces the need for human expertise in designing handcrafted features. This has led to the use of deep architectures in many domains, including computer vision (image classification, visual recognition) and natural language processing (language modeling, speech recognition). Object detection is one such promising area, with strong demand in automated applications such as self-driving cars, robotics, and drone image analysis. This paper analytically reviews state-of-the-art deep learning techniques based on convolutional neural networks for object detection.
