Follow Me: Real-Time in the Wild Person Tracking Application for Autonomous Robotics

In the last 20 years there have been major advances in autonomous robotics. In the IoT (Industry 4.0) context, mobile robots require more intuitive ways of interacting with humans in order to expand their field of applications. This paper describes a user-friendly setup that enables a person to lead the robot through an unknown environment, which the robot has to perceive by means of sensory input. To realize a cost- and resource-efficient Follow Me application, we use a single monocular camera as a low-cost sensor. For efficient scaling of our Simultaneous Localization and Mapping (SLAM) algorithm, we integrate an inertial measurement unit (IMU). With the camera input we detect and track a person. We propose combining state-of-the-art deep learning, in the form of a Convolutional Neural Network (CNN), with SLAM functionality on the same input camera image. Based on their output, robot navigation becomes possible. This work presents the specification and workflow for an efficient development of the Follow Me application. The point clouds delivered by our application are also used for surface construction. For demonstration, we use our SCITOS G5 platform equipped with the aforementioned sensors. Preliminary tests show that the system works robustly in the wild. This work is partially supported by a grant of the BMBF FHprofUnt program, no. 03FH049PX5.
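One way to read the IMU-based scaling is as metric scale recovery: monocular SLAM yields a map only up to an unknown scale factor, and metric displacements from another sensor can fix that factor. The sketch below is an illustration under that assumption, not the method of this paper. The names `estimate_scale` and `scale_points` are hypothetical, and the metric displacements are assumed to be already available (for example, derived from IMU measurements).

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def estimate_scale(slam_disp: Sequence[float], metric_disp: Sequence[float]) -> float:
    """Least-squares scale s minimizing sum((s * slam_i - metric_i)^2).

    slam_disp:   unscaled displacement magnitudes reported by SLAM
    metric_disp: matching displacement magnitudes in metres (assumed to
                 come from IMU data; how they are obtained is not shown)
    """
    if len(slam_disp) != len(metric_disp) or len(slam_disp) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    numerator = sum(a * b for a, b in zip(slam_disp, metric_disp))
    denominator = sum(a * a for a in slam_disp)
    if denominator == 0.0:
        raise ValueError("SLAM displacements are all zero; scale is undefined")
    return numerator / denominator


def scale_points(points: Sequence[Point], s: float) -> List[Point]:
    """Apply the scale factor to every point of an unscaled point cloud."""
    return [(s * x, s * y, s * z) for x, y, z in points]


if __name__ == "__main__":
    s = estimate_scale([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
    print(s, scale_points([(1.0, 0.5, 2.0)], s))
```

The closed-form ratio keeps the estimate cheap enough to recompute as new displacement pairs arrive; a real system would also have to handle IMU noise and drift, which this sketch ignores.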
