Robust multi-pose face tracking by multi-stage tracklet association

We propose an approach for multi-pose face tracking by association of face detection responses in two stages using multiple cues. The low-level stage uses a two-threshold strategy to merge detection responses based on location, size and pose, resulting in short but reliable tracklets. The high-level stage uses different cues for computing a joint similarity measure between tracklets. The facial cue compares facial features of the most frontal face detections in pairs of tracklets. The classifier cue learns a discriminative appearance model for each tracklet, using detection pairs within reliable tracklets and between overlapping tracklets as training data. The constraint cue observes the compatibility of motion of two tracklets. The association of tracklets is globally optimized with the Hungarian algorithm. We validate our approach on two challenging episodes of two TV series and report a Multiple Object Tracking Accuracy (MOTA) of 82% and 68.2%, respectively.

[1]  Rainer Stiefelhagen,et al.  Multi-pose Face Recognition for Person Retrieval in Camera Networks , 2010, 2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance.

[2]  Rainer Stiefelhagen,et al.  Analysis of Local Appearance-Based Face Recognition: Effects of Feature Selection and Feature Normalization , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[3]  Andrew Zisserman,et al.  “Who are you?” - Learning person specific classifiers from video , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Ming-Hsuan Yang,et al.  Visual tracking with online Multiple Instance Learning , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Ramakant Nevatia,et al.  Robust Object Tracking by Hierarchical Association of Detection Responses , 2008, ECCV.

[6]  A. G. Amitha Perera,et al.  Multi-Object Tracking Through Simultaneous Long Occlusions and Split-Merge Conditions , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[7]  Christian Küblbeck,et al.  Face detection and tracking in video sequences using the modifiedcensus transformation , 2006, Image Vis. Comput..

[8]  Bi Song,et al.  A Stochastic Graph Evolution Framework for Robust Multi-target Tracking , 2010, ECCV.

[9]  Ramakant Nevatia,et al.  How does person identity recognition help multi-person tracking? , 2011, CVPR 2011.

[10]  Ramakant Nevatia,et al.  Multi-target tracking by on-line learned discriminative appearance models , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[11]  Rainer Stiefelhagen,et al.  Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics , 2008, EURASIP J. Image Video Process..