Attention-Based Road Registration for GPS-Denied UAS Navigation.

Matching and registering aerial images against prestored road landmarks are critical techniques for enhancing unmanned aerial system (UAS) navigation in global positioning system (GPS)-denied urban environments. Current registration pipelines typically consist of two separate stages: road extraction followed by road registration. These two-stage approaches are time-consuming and less robust to noise. In this article, we investigate, for the first time, the problem of end-to-end Aerial-Road registration. Using deep learning, we develop a novel attention-based neural network architecture for this task. The model uses two neural network branches with shared weights to map the two input images into a common embedding space. In addition, because road features are sparsely distributed in images, we incorporate a novel multibranch attention module that filters out false descriptor matches arising from the indiscriminative background, which improves registration accuracy. Extensive experiments show that, compared with state-of-the-art approaches, our approach reduces the mean absolute errors in rotation angle and in the x- and y-direction translations by factors of 1.24, 1.38, and 1.44, respectively. Furthermore, as a byproduct, our experimental results demonstrate the feasibility of a neural network multitask learning approach that simultaneously achieves accurate Aerial-Road matching and registration, thus providing efficient and accurate UAS geolocalization.
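
The registration result described above is a rotation angle plus x- and y-translations, evaluated by mean absolute error. The following is a minimal sketch of how such parameters could be turned into a 2-D rigid transform and scored. The function names, the degree and pixel units, and the wrap-around handling of angle differences are assumptions made for illustration; the abstract does not specify them, and this is not the authors' evaluation code.

```python
import math


def rigid_transform(theta_deg, tx, ty):
    """Build a 3x3 homogeneous matrix: rotate by theta_deg, then translate by (tx, ty).

    Units (degrees, pixels) are an assumption for this sketch.
    """
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, tx],
            [s, c, ty],
            [0.0, 0.0, 1.0]]


def apply_transform(matrix, point):
    """Map a 2-D point (x, y) through a 3x3 homogeneous rigid transform."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])


def angle_error(pred_deg, true_deg):
    """Absolute angular difference in degrees, wrapped into [0, 180].

    Wrapping is an assumption; it keeps 350 vs. 2 degrees at 12, not 348.
    """
    d = (pred_deg - true_deg) % 360.0
    return min(d, 360.0 - d)


def mean_absolute_errors(predictions, targets):
    """Return (rotation MAE, x-translation MAE, y-translation MAE).

    Each item in predictions and targets is a (theta_deg, tx, ty) tuple.
    """
    n = len(predictions)
    pairs = list(zip(predictions, targets))
    rot = sum(angle_error(p[0], t[0]) for p, t in pairs) / n
    dx = sum(abs(p[1] - t[1]) for p, t in pairs) / n
    dy = sum(abs(p[2] - t[2]) for p, t in pairs) / n
    return rot, dx, dy
```

For example, `mean_absolute_errors([(10.0, 5.0, -3.0)], [(12.0, 4.0, -1.0)])` compares one hypothetical predicted registration against its ground truth and yields one error per parameter, which is the per-parameter form in which the abstract reports its improvements.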
