Appearance-Based 3D Upper-Body Pose Estimation and Person Re-identification on Mobile Robots

In the field of human-robot interaction (HRI), detection, tracking and re-identification of humans in a robot's surroundings are crucial tasks, e.g., for socially compliant robot navigation. Besides 3D position detection, estimating a person's upper-body orientation from monocular camera images is a challenging problem on a mobile platform. To obtain real-time position tracking as well as upper-body orientation estimates, the proposed system comprises discriminative detectors whose hypotheses are tracked by a Kalman filter-based multi-hypothesis tracker. For appearance-based person recognition, a generative approach based on a 3D shape model is used to refine these tracked hypotheses. This model evaluates edges and color-based discrimination from the background. Furthermore, the texture of each person's upper body is learned and used for person re-identification. When computational resources are limited, the update rate of the model-based optimization is reduced automatically. The estimation accuracy then decreases, but the system keeps tracking the persons around the robot in real time. The person's 3D pose is tracked up to a distance of 5.0 meters with an average Euclidean error of 18 cm. The achieved motion-independent average upper-body orientation error is 22°. Furthermore, the upper-body texture is learned online, which allowed stable person re-identification in our experiments.
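
The two accuracy figures above (average Euclidean position error and average orientation error) can be illustrated with a small sketch. The abstract does not state how the errors were computed, so the following is only one common convention, not the authors' evaluation code: position error is the mean Euclidean distance between estimated and reference 3D points, and orientation error is the mean absolute angular difference with wrap-around at 360°. All function names are hypothetical.

```python
import math


def angular_error_deg(estimated, reference):
    """Smallest absolute difference between two angles in degrees (0 to 180).

    Assumption: orientations are yaw angles in degrees; the wrap-around
    handling makes 350 deg vs. 10 deg count as 20 deg, not 340 deg.
    """
    diff = (estimated - reference + 180.0) % 360.0 - 180.0
    return abs(diff)


def mean_orientation_error_deg(estimates, references):
    """Average wrap-around angular error over paired angle sequences."""
    errors = [angular_error_deg(e, r) for e, r in zip(estimates, references)]
    return sum(errors) / len(errors)


def mean_position_error_m(estimates, references):
    """Average Euclidean distance between paired 3D points (in meters)."""
    errors = [math.dist(e, r) for e, r in zip(estimates, references)]
    return sum(errors) / len(errors)
```

With this convention, an estimate of 350° against a reference of 10° contributes an error of 20°, which avoids penalizing estimates that are close to the reference across the 0°/360° boundary.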
