Evaluation of Deep Models for Real-Time Small Object Detection

Real-time object detection is crucial for many applications. Approaches based on Deep Learning have achieved state-of-the-art performance on challenging datasets. Although several evaluations of the models have been conducted, there is no extensive evaluation with specific focuses on real-time small object detection. In this work, we present an in-depth evaluation of existing deep learning models in detecting small objects. We evaluate three state-of-the-art models including You Only Look Once (YOLO), Single Shot MultiBox Detector (SSD), and Faster R-CNN with related trade-off factors i.e. accuracy, execution time and resource constraints. Experiments were conducted on benchmark datasets and a newly generated dataset for small object detection. All analyses and findings are then presented.

[1]  Larry S. Davis,et al.  Composite Discriminant Factor analysis , 2014, IEEE Winter Conference on Applications of Computer Vision.

[2]  Jian Sun,et al.  Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Larry S. Davis,et al.  Vehicle Detection Using Partial Least Squares , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Krista A. Ehinger,et al.  SUN Database: Exploring a Large Collection of Scene Categories , 2014, International Journal of Computer Vision.

[6]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[7]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[8]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[9]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Jianxiong Xiao,et al.  R-CNN for Small Object Detection , 2016, ACCV.

[11]  Silvio Savarese,et al.  Social LSTM: Human Trajectory Prediction in Crowded Spaces , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Ali Farhadi,et al.  YOLO9000: Better, Faster, Stronger , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Baoli Li,et al.  Traffic-Sign Detection and Classification in the Wild , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Antonio Torralba,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 80 Million Tiny Images: a Large Dataset for Non-parametric Object and Scene Recognition , 2022 .