Automatic Detection of Track and Fields in China from High-Resolution Satellite Images Using Multi-Scale-Fused Single Shot MultiBox Detector

Object detection, an important task in remote sensing, faces various challenges, especially in large scenes, owing to the increasing resolution of satellite images and the complexity of land cover. Because track and fields vary widely in appearance, backgrounds are complex, and satellite images differ from one another, even advanced deep learning methods have difficulty extracting accurate features of track and fields from large, complex scenes such as the whole of China. Taking track and fields as a case study, we propose a stable and accurate method for target detection. First, we add “deconvolution” and “concat” modules to the structure of the original Single Shot MultiBox Detector (SSD), in which Visual Geometry Group 16 (VGG16) serves as the base network, followed by multiple convolution layers. The two modules upsample the high-level feature map and concatenate it with the low-level feature map, forming a new network structure, the multi-scale-fused SSD (abbreviated as MSF_SSD). MSF_SSD enriches the semantic information of the low-level features, which is especially effective for small targets in large scenes. In addition, a large number of track and field samples are collected across the whole of China, and a series of parameters is designed to optimize the MSF_SSD network through in-depth analysis of the sample characteristics. Finally, using the MSF_SSD network, we achieve rapid and automatic detection of meter-level track and fields across the country for the first time. The proposed MSF_SSD model achieves 97.9% mean average precision (mAP) on the validation set, which is superior to the 88.4% mAP of the original SSD. In addition, the model achieves an accuracy of 94.3% while keeping the recall rate at a high level (98.8%) on the nationally distributed test set, outperforming the original SSD method.
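The "deconvolution" and "concat" step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give layer names, kernel sizes, or channel counts, so the 2×2 kernel, the stride of 2, the constant weights (learned in a real network), the tensor shapes, and the function names `deconv2x` and `fuse` are all assumptions chosen for illustration.

```python
import numpy as np


def deconv2x(feat, kernel):
    """Stride-2 transposed convolution with a 2x2 kernel (assumed sizes).

    feat:   high-level feature map, shape (C_in, H, W)
    kernel: weights, shape (C_in, C_out, 2, 2); learned in a real network
    returns an upsampled map of shape (C_out, 2H, 2W)
    """
    c_in, h, w = feat.shape
    k_in, c_out, kh, kw = kernel.shape
    if k_in != c_in or (kh, kw) != (2, 2):
        raise ValueError("kernel must have shape (C_in, C_out, 2, 2)")
    out = np.zeros((c_out, 2 * h, 2 * w), dtype=feat.dtype)
    for i in range(h):
        for j in range(w):
            # Each input pixel spreads into one 2x2 output patch;
            # with stride 2 and a 2x2 kernel the patches do not overlap.
            patch = np.einsum("c,cokl->okl", feat[:, i, j], kernel)
            out[:, 2 * i:2 * i + 2, 2 * j:2 * j + 2] += patch
    return out


def fuse(low, high, kernel):
    """Upsample the high-level map and concatenate it with the low-level
    map along the channel axis (the 'concat' step)."""
    up = deconv2x(high, kernel)
    if up.shape[1:] != low.shape[1:]:
        raise ValueError("spatial sizes must match after upsampling")
    return np.concatenate([low, up], axis=0)


# Hypothetical shapes: a 4-channel 8x8 low-level map and a
# 6-channel 4x4 high-level map reduced to 3 channels on upsampling.
low = np.ones((4, 8, 8))
high = np.ones((6, 4, 4))
kernel = np.ones((6, 3, 2, 2))
fused = fuse(low, high, kernel)
print(fused.shape)  # (7, 8, 8)
```

The fused map keeps the spatial resolution of the low-level feature map while carrying extra channels derived from the high-level one, which is the sense in which the low-level features gain semantic information.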
