Performance Comparison of Object Detection Algorithms with different Feature Extractors

This work measures the speed versus accuracy trade-off of different neural network architectures for object detection, each paired with alternative feature extractors, in order to identify the fastest and most accurate combination. We combine three architectures with three feature extractors to build the models and compute mean average precision (mAP), the metric used to assess accuracy. Sample images are drawn from the COCO dataset, and the work is implemented with the TensorFlow library.
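To make the accuracy metric concrete, the following is a minimal sketch of how average precision (AP) can be computed for one class and then averaged into mAP. It is not the evaluation code used in this work: the function names `average_precision` and `mean_average_precision` are hypothetical, the inputs assume each detection has already been matched against ground truth at a fixed IoU threshold, and the official COCO protocol additionally averages over several IoU thresholds, which is omitted here.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """All-point interpolated AP for a single class (illustrative sketch).

    scores: confidence of each detection.
    is_true_positive: whether each detection matched a ground-truth box
        (the matching step itself is assumed to be done already).
    num_ground_truth: number of ground-truth boxes for this class.
    """
    if num_ground_truth == 0:
        return 0.0

    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    tp = 0
    fp = 0
    precisions = []
    recalls = []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)

    # Replace each precision with the maximum precision at any higher recall,
    # so the curve is non-increasing as recall grows.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Area under the interpolated precision-recall curve.
    ap = 0.0
    prev_recall = 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap):
    """mAP as the unweighted mean of per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)
```

For instance, three detections ranked correct, incorrect, correct against two ground-truth boxes give an AP of 5/6 under this interpolation, and averaging that value with the AP of the other classes yields the mAP.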
