Interactive Person Following and Gesture Recognition with a Flying Robot

Gesture recognition and person following play a vital role in social robotics. In this paper, we present an approach that allows a quadrocopter to follow a person and to recognize simple gestures using an onboard depth camera. This enables novel applications such as hands-free video recording and picture taking. Tracking a person from a moving platform with an onboard camera is highly challenging. To overcome this problem, we stabilize the depth image by warping it to a virtual static camera, using the pose of the quadrocopter estimated from vision and inertial sensors with an extended Kalman filter. The stabilized depth video can then be used with state-of-the-art motion capture solutions such as the OpenNI tracker, which gives us the full body pose. This pose can be used, for example, to recognize simple gestures that control the quadrocopter’s behaviour. Our approach recognizes a small set of example commands (“follow me”, “take picture”, “land”) and generates the corresponding motion commands. We demonstrate the practical performance of our approach in an extensive set of experiments with a quadrocopter.
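
The stabilization step can be illustrated with a minimal sketch. The abstract does not say how the warp is computed, so the code below is only one plausible reading: every depth pixel is back-projected to 3D, moved into the frame of a fixed virtual camera using the estimated pose, and re-projected with a z-buffer. The function name, the pinhole intrinsics `K` shared by both cameras, and the 4x4 transform `T_static_from_moving` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def warp_depth_to_static_camera(depth, K, T_static_from_moving):
    """Re-render a depth image as seen from a fixed virtual camera.

    depth: (H, W) metric depths; 0 or NaN marks invalid pixels.
    K: (3, 3) pinhole intrinsics, assumed identical for both cameras.
    T_static_from_moving: (4, 4) rigid transform that maps points from the
        moving camera frame into the virtual static camera frame
        (hypothetical input, e.g. derived from the filtered pose estimate).
    """
    h, w = depth.shape
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]

    # Pixel grid; keep only pixels that carry a valid depth reading.
    v, u = np.mgrid[0:h, 0:w]
    z = depth.astype(float).ravel()
    valid = np.isfinite(z) & (z > 0)
    u, v, z = u.ravel()[valid], v.ravel()[valid], z[valid]

    # Back-project to homogeneous 3D points in the moving camera frame.
    pts = np.stack([(u - cx) / fx * z, (v - cy) / fy * z, z, np.ones_like(z)])

    # Move the points into the virtual static camera frame.
    xs, ys, zs = (T_static_from_moving @ pts)[:3]
    front = zs > 1e-6  # discard points behind the virtual camera
    xs, ys, zs = xs[front], ys[front], zs[front]

    # Project into the virtual image and drop points outside its bounds.
    us = np.round(fx * xs / zs + cx).astype(int)
    vs = np.round(fy * ys / zs + cy).astype(int)
    inside = (us >= 0) & (us < w) & (vs >= 0) & (vs < h)
    us, vs, zs = us[inside], vs[inside], zs[inside]

    # Z-buffer: when several points land on one pixel, keep the nearest.
    out = np.full((h, w), np.inf)
    np.minimum.at(out, (vs, us), zs)
    out[np.isinf(out)] = 0.0  # pixels that received no point stay invalid
    return out
```

This forward-splatting variant leaves holes wherever no source pixel lands; a practical system would likely fill or mask them before passing the image to a skeleton tracker.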
