Assessment of Object Detection Using Deep Convolutional Neural Networks

Detecting objects in images and videos has long been an active research area in computer vision and artificial intelligence, with applications including robotics, self-driving cars, automated video surveillance, crowd management, home automation, manufacturing, activity recognition, medical imaging, and biometrics. Recent years have witnessed a boom in deep learning, driven by its strong performance on image classification and detection challenges in visual recognition competitions such as PASCAL VOC, Microsoft COCO, and ImageNet. Deep convolutional neural networks have produced promising results for object detection by removing the need for human experts to handcraft features manually. Instead, the network learns features automatically from large-scale image data, and training is parallelized on powerful GPUs, which reduces training time. This paper highlights state-of-the-art approaches based on deep convolutional neural networks that are designed specifically for object detection in images.
