Joint operator detection and tracking for person following from mobile platforms

In this paper, we propose an integrated system that detects and tracks a single operator who may leave and (re-)enter the scene, so that the target can switch off and on. Our method is based on a set-valued Bayes-optimal state estimator that integrates RGB-D detections and image-based classification to improve tracking results in severe clutter and under long-term occlusion. The classifier is trained in two stages: first, we train a deep convolutional neural network to obtain a feature representation for person re-identification. Then, using the output of the state estimator, we bootstrap a classifier that discriminates the operator from the remaining people. We evaluate the approach on a publicly available multi-target tracking dataset as well as on custom datasets that are specific to our problem formulation. Experimental results suggest reliable tracking accuracy in crowded scenes and robust re-detection after long-term occlusion.
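
To illustrate how a set-valued estimator can represent a target that switches off and on, the following is a minimal sketch of the existence-probability recursion of a Bernoulli-type filter. It is an assumption that the estimator takes this form; the abstract does not name the filter. The function names and the parameters `p_birth`, `p_survive`, and `p_detect` are hypothetical and assumed to be state-independent. Only the case of an empty measurement set (the operator is occluded or has left the scene) is shown.

```python
def predict_existence(q, p_birth, p_survive):
    """Predict the probability that the operator is present.

    q         -- posterior existence probability from the previous step
    p_birth   -- assumed probability that an absent operator (re-)enters
    p_survive -- assumed probability that a present operator stays
    """
    return p_birth * (1.0 - q) + p_survive * q


def update_existence_no_detection(q, p_detect):
    """Bayes update of the existence probability when nothing is detected.

    Assumes a state-independent detection probability p_detect and
    p_detect * q < 1, so that the denominator is non-zero.
    """
    return (1.0 - p_detect) * q / (1.0 - p_detect * q)


def simulate_occlusion(q0, steps, p_birth=0.05, p_survive=0.99, p_detect=0.9):
    """Run predict/update cycles with no detections and return the trace."""
    trace = [q0]
    q = q0
    for _ in range(steps):
        q = predict_existence(q, p_birth, p_survive)
        q = update_existence_no_detection(q, p_detect)
        trace.append(q)
    return trace


if __name__ == "__main__":
    # Existence belief decays while the operator remains undetected.
    for step, q in enumerate(simulate_occlusion(q0=0.95, steps=5)):
        print(f"step {step}: q = {q:.3f}")
```

In this sketch the existence probability drops while no detection arrives but stays above zero because of the birth term, which is what allows a filter of this kind to switch the target back on at re-entry. A complete filter would also propagate the spatial density and handle clutter and non-empty measurement sets, which are omitted here.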
