Toward Fast and Accurate Vehicle Detection in Aerial Images Using Coupled Region-Based Convolutional Neural Networks

Vehicle detection in aerial images is an interesting but challenging problem that plays an important role in a wide range of applications. Traditional methods rely on sliding-window search and on handcrafted or shallow-learning-based features, which incur heavy computational costs and offer limited representation power. Recently, deep learning algorithms, especially region-based convolutional neural networks (R-CNNs), have achieved state-of-the-art detection performance in computer vision. However, several challenges limit the application of R-CNNs to vehicle detection in aerial images: 1) vehicles in large-scale aerial images are relatively small, and R-CNNs localize small objects poorly; 2) R-CNNs are designed to detect the bounding boxes of targets and do not extract their attributes; 3) manual annotation is generally expensive, and the manually annotated vehicles available for training R-CNNs are too few. To address these problems, this paper proposes a fast and accurate vehicle detection framework. On one hand, to accurately extract vehicle-like targets, we develop an accurate-vehicle-proposal-network (AVPN) based on a hyper feature map, which combines hierarchical feature maps and is more accurate for small object detection. On the other hand, we propose a coupled R-CNN method, which combines an AVPN and a vehicle attribute learning network to extract a vehicle's location and attributes simultaneously. Because the original large-scale aerial images come with limited manual annotations, we train on cropped image blocks and apply data augmentation to avoid overfitting. Comprehensive evaluations on the public Munich vehicle dataset and on our collected vehicle dataset demonstrate the accuracy and effectiveness of the proposed method.
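As a rough illustration of training on cropped image blocks with augmentation, the sketch below computes overlapping crop windows over a large image and mirrors a bounding box for a horizontal-flip augmentation. The block size, the overlap, and the function names (`tile_origins`, `crop_boxes`, `flip_box_horizontal`) are assumptions made for this example; the abstract does not specify them, and this is not the authors' implementation.

```python
def tile_origins(length, block, stride):
    """Start offsets of windows of size `block` that cover [0, length)."""
    if length <= block:
        return [0]
    origins = list(range(0, length - block + 1, stride))
    # Add one last window flush with the border so the edge is covered.
    if origins[-1] != length - block:
        origins.append(length - block)
    return origins


def crop_boxes(width, height, block=512, overlap=64):
    """Return (x1, y1, x2, y2) crop windows tiling a width x height image.

    `block` and `overlap` are illustrative values, not taken from the paper.
    """
    stride = block - overlap
    return [
        (x, y, min(x + block, width), min(y + block, height))
        for y in tile_origins(height, block, stride)
        for x in tile_origins(width, block, stride)
    ]


def flip_box_horizontal(box, block_width):
    """Mirror a bounding box (x1, y1, x2, y2) inside a block of given width."""
    x1, y1, x2, y2 = box
    return (block_width - x2, y1, block_width - x1, y2)


if __name__ == "__main__":
    for window in crop_boxes(1000, 512):
        print(window)
    print(flip_box_horizontal((10, 20, 50, 60), 512))
```

Overlapping windows reduce the chance that a small vehicle is split across a block border. The overlap would normally be chosen to be at least as large as the largest expected vehicle.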
