On the use of deep neural networks for the detection of small vehicles in ortho-images

This paper addresses the detection of small targets (vehicles) in ortho-images. This problem differs from the general task of detecting objects in images in several respects. First, the vehicles to be detected are small, typically smaller than 20×20 pixels. Second, because landscapes on Earth are so varied, many pixel structures resembling a vehicle can appear (rooftops, shadow patterns, rocks, buildings), whereas the intra-class variability of the vehicle class is limited, since vehicles all look alike from afar. Finally, the imbalance between the vehicles and the rest of the picture is enormous in most cases. Specifically, this paper focuses on the detection tasks introduced by the VEDAI dataset [1]. This work provides an extensive study of the problems one might face when applying deep neural networks to low-resolution imagery with scarce data, and proposes some solutions. One of the contributions of this paper is a network that substantially outperforms the state of the art while being much simpler to implement and much faster than competing approaches. We also list the limitations of this approach and offer several new ideas to further improve our results.

[1] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.

[2] Philip H. S. Torr, et al. BING: Binarized normed gradients for objectness estimation at 300fps, 2014, Computational Visual Media.

[3] Dumitru Erhan, et al. Scalable, High-Quality Object Detection, 2014, ArXiv.

[4] Trevor Darrell, et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[5] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[6] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.

[7] Xiang Zhang, et al. OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks, 2013, ICLR.

[8] Ross B. Girshick, et al. Fast R-CNN, 2015, 1504.08083.

[9] Bertrand Le Saux, et al. On the usability of deep networks for object-based image analysis, 2016, ArXiv.

[10] Larry S. Davis, et al. Vehicle Detection Using Partial Least Squares, 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12] Trevor Darrell, et al. Fully Convolutional Networks for Semantic Segmentation, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13] C. N. Savithri, et al. Vehicle Detection From Aerial Imagery, 2013.

[14] Yi Li, et al. R-FCN: Object Detection via Region-based Fully Convolutional Networks, 2016, NIPS.

[15] Ali Farhadi, et al. You Only Look Once: Unified, Real-Time Object Detection, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16] Nitish Srivastava, et al. Improving neural networks by preventing co-adaptation of feature detectors, 2012, ArXiv.

[17] Dumitru Erhan, et al. Deep Neural Networks for Object Detection, 2013, NIPS.

[18] C. Lawrence Zitnick, et al. Edge Boxes: Locating Object Proposals from Edges, 2014, ECCV.

[19] Frédéric Jurie, et al. Vehicle detection in aerial imagery: A small target detection benchmark, 2016, J. Vis. Commun. Image Represent.

[20] Roberto Cipolla, et al. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21] Shiming Xiang, et al. Vehicle Detection in Satellite Images by Parallel Deep Convolutional Neural Networks, 2013, 2013 2nd IAPR Asian Conference on Pattern Recognition.

[22] Vladlen Koltun, et al. Multi-Scale Context Aggregation by Dilated Convolutions, 2015, ICLR.

[23] Koen E. A. van de Sande, et al. Selective Search for Object Recognition, 2013, International Journal of Computer Vision.

[24] Horst Bischof, et al. Recognizing cars in aerial imagery to improve orthophotos, 2007, GIS.

[25] Wei Liu, et al. SSD: Single Shot MultiBox Detector, 2015, ECCV.

[26] Ramakant Nevatia, et al. Car detection in low resolution aerial images, 2003, Image Vis. Comput.

[27] Luc Van Gool, et al. The 2005 PASCAL Visual Object Classes Challenge, 2005, MLCW.

[28] Sébastien Razakarivony, et al. Apprentissage de variétés pour la Détection et Reconnaissance de véhicules faiblement résolus en imagerie aérienne, 2014.

[29] Jian Sun, et al. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition, 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[30] Pascal Fua, et al. A Real-Time Deformable Detector, 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31] Nitesh V. Chawla, et al. SMOTE: Synthetic Minority Over-sampling Technique, 2002, J. Artif. Intell. Res.

[32] Wesam A. Sakla, et al. A Large Contextual Dataset for Classification, Detection and Counting of Cars with Deep Learning, 2016, ECCV.