The ability of an autonomous robot to track and identify multiple humans and to understand their intentions is crucial for socialized human-robot interaction in dynamic environments (Michalowski and Simmons 2006). Take CoBot (Rosenthal, Biswas, and Veloso 2010) trying to enter an elevator as an example. When the elevator door opens and the elevator is occupied by multiple humans, CoBot needs to track each human’s state and intention, namely whether he or she is going to exit the elevator. To interact with humans safely and in a friendly manner, CoBot should decide to enter the elevator only when every human who intends to exit is believed to have exited.

Most multi-object tracking (MOT) methods follow a tracking-by-detection paradigm (Yilmaz, Javed, and Shah 2006; Andriluka, Roth, and Schiele 2008). In this setting, an object detector runs on each frame to obtain a set of detections that serve as input to a tracker. Tracking-by-detection algorithms can be roughly classified into two groups: online and offline. Online tracking recursively estimates the current situation given past observations in a Bayesian way; offline tracking is typically formulated as a global optimization problem that finds optimal paths given the whole sequence of observations. In this paper, we focus on online tracking, which is more suitable for applications on robots.

Given an inevitably imperfect human detector (with occasional false and missing detections), we model the intention-aware online multi-human tracking problem as a hidden Markov model (HMM). Formally, a joint state $S$ is defined as the set of all humans, $S = \{h_i\}_{i=1:|S|}$, where a human is represented as a high-dimensional vector $h = (s, i)$. Here $s = (x, y, \dot{x}, \dot{y})$ is the physical state, and $i \in I$ is an intention. In our experiments, we introduce a moving intention to move to a potential goal, and a staying intention to stay almost in the same area. In the MOT domain, most existing approaches assume one
[1] Luc Van Gool, et al. Online Multiperson Tracking-by-Detection from a Single, Uncalibrated Camera. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011.
[2] Manuela M. Veloso, et al. Localization and navigation of the CoBots over long-term deployments. Int. J. Robotics Res., 2013.
[3] Anton Milan. Energy minimization for multiple object tracking. 2013.
[4] Rainer Stiefelhagen, et al. Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics. EURASIP J. Image Video Process., 2008.
[5] Reid G. Simmons, et al. Multimodal person tracking and attention classification. HRI '06, 2006.
[6] Stefan Roth, et al. People-tracking-by-detection and people-detection-by-tracking. 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008.
[7] Ian D. Reid, et al. Latent Data Association: Bayesian Model Selection for Multi-target Tracking. 2013 IEEE International Conference on Computer Vision, 2013.
[8] Stephanie Rosenthal, et al. An effective personal mobile robot agent through symbiotic human-robot interaction. AAMAS, 2010.
[9] A. Tustin. Automatic Control. Nature, 1951.
[10] J. Ferryman, et al. PETS2009: Dataset and challenge. 2009 Twelfth IEEE International Workshop on Performance Evaluation of Tracking and Surveillance, 2009.
[11] Afshin Dehghan, et al. GMCP-Tracker: Global Multi-object Tracking Using Generalized Minimum Clique Graphs. ECCV, 2012.
[12] Yaakov Bar-Shalom, et al. Sonar tracking of multiple targets using joint probabilistic data association. 1983.