Multi-Modal Human Detection from Aerial Views by Fast Shape-Aware Clustering and Classification

Recognizing humans from aerial views is an increasingly relevant task, a trend driven mainly by the widespread use of unmanned aerial vehicles (UAVs). Accurate, real-time visual human recognition remains a scientific challenge, however, because typical UAV imaging conditions and computational capabilities introduce complexities and constraints. Motion blur, the non-specific top-view appearance of humans, low image resolution and limited onboard computational resources are among the most important limiting factors. In this paper we propose a run-time-efficient multi-modal detection framework that performs clustering and recognition on thermal infrared, passive stereo depth and intensity channels in order to cope with these complexities and achieve accurate human detection. Thermal infrared and depth data are used to generate proposals by means of a clustering scheme driven by an explicit, tree-structured shape representation. The generated proposals serve as input to a discriminatively trained deep classification step that recognizes humans. The proposed clustering and classification scheme is validated qualitatively and quantitatively on four large aerial datasets representing complex situations (small objects, clutter, occlusions).
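The proposal stage described above can be illustrated with a minimal sketch. It is not the authors' tree-structured, shape-aware clustering: it substitutes plain 4-connected component labeling on a thresholded thermal image and then rejects blobs whose depth values are inconsistent. The function name `generate_proposals`, its parameters (`hot_threshold`, `min_pixels`, `max_depth_spread`) and the toy input grids are all hypothetical and chosen only for illustration.

```python
from collections import deque


def generate_proposals(thermal, depth, hot_threshold,
                       min_pixels=2, max_depth_spread=1.0):
    """Return boxes (row_min, col_min, row_max, col_max) of warm blobs.

    Assumption: a stand-in for the paper's clustering scheme. Warm pixels
    are grouped by 4-connectivity, and a blob is kept only if it is large
    enough and its depth values lie within max_depth_spread of each other.
    """
    rows, cols = len(thermal), len(thermal[0])
    seen = [[False] * cols for _ in range(rows)]
    proposals = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or thermal[r][c] < hot_threshold:
                continue
            # Breadth-first flood fill over 4-connected warm pixels.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and thermal[ny][nx] >= hot_threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Depth consistency check: reject blobs spanning several depths.
            depths = [depth[y][x] for y, x in pixels]
            if len(pixels) < min_pixels:
                continue
            if max(depths) - min(depths) > max_depth_spread:
                continue
            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            proposals.append((min(ys), min(xs), max(ys), max(xs)))
    return proposals


# Toy thermal image (hypothetical temperature values) with two warm blobs.
thermal = [
    [20, 20, 20, 20, 20, 20],
    [20, 36, 36, 20, 20, 20],
    [20, 36, 36, 20, 35, 20],
    [20, 20, 20, 20, 35, 20],
    [20, 20, 20, 20, 20, 20],
]
# Toy depth map (hypothetical metres); the second blob straddles two depths.
depth = [[10.0] * 6 for _ in range(5)]
depth[3][4] = 14.0

proposals = generate_proposals(thermal, depth, hot_threshold=30)
print(proposals)
```

In a complete pipeline, each returned box would be cropped from the input channels and passed to the classification step; that step is omitted here because the abstract gives no details of the network.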
