Deep learning object recognition in multi-spectral UAV imagery

The range of applications of unmanned aerial vehicles (UAVs) has grown significantly in recent years owing to progress in hardware and algorithms for data acquisition and processing. Object detection and classification (recognition) in imagery acquired by a UAV are key tasks for many applications, and in practice they are usually performed by a human operator. The growing amount of data of different types and origins makes it possible to apply deep learning, which currently shows strong results in object detection and recognition. Two key problems must be solved to apply deep learning to object recognition in multi-spectral imagery: (a) the availability of a representative dataset for neural network training and testing and (b) an effective way of fusing multi-spectral data during neural network training. This paper proposes approaches to both problems. To create a representative dataset, synthetic infra-red images are generated from several real infra-red images and a 3D model of a given object. A technique for realistic infra-red texturing, based on accurate estimation of the infra-red image exterior orientation and the 3D model pose, is developed. It automatically produces datasets of the size required for deep learning, together with the ground truth data for neural network training and testing. Two approaches to multi-spectral data fusion for object recognition are developed and evaluated: data-level fusion and results-level fusion. The results of evaluating both techniques on the generated multi-spectral dataset are presented and discussed.
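
The difference between the two fusion strategies named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that data-level fusion means stacking a co-registered infra-red image onto the visible image as an extra input channel, and that results-level fusion means a weighted average of the class scores produced by two separately trained networks. The function names and the `weight_rgb` parameter are hypothetical.

```python
import numpy as np


def data_level_fusion(rgb, ir):
    """Stack a visible image (H, W, 3) and an IR image (H, W) into one (H, W, 4) array.

    Assumption: both images are already co-registered pixel to pixel.
    """
    if rgb.shape[:2] != ir.shape[:2]:
        raise ValueError("visible and IR images must have the same height and width")
    # Append the IR image as a fourth channel of the network input.
    return np.concatenate([rgb, ir[..., np.newaxis]], axis=-1)


def results_level_fusion(scores_rgb, scores_ir, weight_rgb=0.5):
    """Combine per-class scores from two networks by a weighted average.

    Assumption: each score vector comes from a separate network,
    one trained on visible images and one on IR images.
    Returns the fused scores and the index of the winning class.
    """
    scores_rgb = np.asarray(scores_rgb, dtype=float)
    scores_ir = np.asarray(scores_ir, dtype=float)
    fused = weight_rgb * scores_rgb + (1.0 - weight_rgb) * scores_ir
    return fused, int(np.argmax(fused))
```

In the data-level variant a single network sees all spectral channels at once, so it can learn cross-channel features but needs registered inputs. In the results-level variant each modality is processed independently and only the final decisions are merged.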
