The fusion of complementary information from co-registered multi-modal image data enables a more detailed and more robust understanding of an image scene or of specific objects, and is important for several remote sensing applications. In this paper, the benefits of combining RGB, near infrared (NIR) and thermal infrared (TIR) aerial images for semantic vehicle segmentation with deep neural networks are investigated. To this end, RGB, NIR and TIR image triplets acquired by the Modular Aerial Camera System (MACS) are precisely co-registered by means of a virtual camera system and subsequently used to train different neural network architectures. Various experiments were conducted to investigate the influence of the individual sensor characteristics, as well as of early versus late fusion within the network, on the quality of the segmentation results.
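The paper itself does not include code; the following is a minimal sketch of the two fusion strategies it compares. In early fusion, the co-registered RGB (3-channel), NIR (1-channel) and TIR (1-channel) images are concatenated into a single multi-channel input before the encoder; in late fusion, each modality passes through its own encoder and the feature maps are merged before the segmentation head. All class names, channel widths and the toy encoder are assumptions for illustration, not the architectures evaluated in the paper.

```python
# Minimal sketch (not the paper's actual code): early vs. late fusion of
# co-registered RGB (3 ch), NIR (1 ch) and TIR (1 ch) inputs for segmentation.
# Encoder depth, channel widths and class count are illustrative assumptions.
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class EarlyFusionSeg(nn.Module):
    """Concatenate all modalities into one 5-channel input before the encoder."""

    def __init__(self, num_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(conv_block(3 + 1 + 1, 32), conv_block(32, 64))
        self.head = nn.Conv2d(64, num_classes, kernel_size=1)

    def forward(self, rgb, nir, tir):
        x = torch.cat([rgb, nir, tir], dim=1)  # (B, 5, H, W)
        return self.head(self.encoder(x))


class LateFusionSeg(nn.Module):
    """One encoder per modality; feature maps are fused before the head."""

    def __init__(self, num_classes=2):
        super().__init__()
        self.enc_rgb = nn.Sequential(conv_block(3, 32), conv_block(32, 64))
        self.enc_nir = nn.Sequential(conv_block(1, 32), conv_block(32, 64))
        self.enc_tir = nn.Sequential(conv_block(1, 32), conv_block(32, 64))
        self.head = nn.Conv2d(3 * 64, num_classes, kernel_size=1)

    def forward(self, rgb, nir, tir):
        fused = torch.cat(
            [self.enc_rgb(rgb), self.enc_nir(nir), self.enc_tir(tir)], dim=1
        )
        return self.head(fused)


if __name__ == "__main__":
    rgb = torch.randn(1, 3, 128, 128)
    nir = torch.randn(1, 1, 128, 128)
    tir = torch.randn(1, 1, 128, 128)
    print(EarlyFusionSeg()(rgb, nir, tir).shape)  # torch.Size([1, 2, 128, 128])
    print(LateFusionSeg()(rgb, nir, tir).shape)   # torch.Size([1, 2, 128, 128])
```

The key trade-off this sketch makes visible: early fusion lets the first layers learn cross-modal filters directly, while late fusion keeps modality-specific feature extraction separate at the cost of roughly one encoder per sensor.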