Beyond Pedestrians: A Hybrid Approach of Tracking Multiple Articulating Humans

We propose a hybrid framework to address the problem of tracking multiple articulated humans from a single camera. Our method incorporates offline learned category-level detector with online learned instance-specific detector as a hybrid system. To deal with humans in large pose articulation, which can not be reliably detected by off-line trained detectors, we propose an online learned instance specific patch-based detector, consisting of layered patch classifiers. With extrapolated track lets by online learned detectors, we use the discriminative color filters learned online to compute the appearance affinity score for further global association. Experimental evaluation on both standard pedestrian datasets and articulated human datasets shows significant improvement compared to state-of-the-art multi-human tracking methods.

[1]  Pushmeet Kohli,et al.  On Detection of Multiple Object Instances Using Hough Transforms , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Ales Leonardis,et al.  An adaptive coupled-layer visual model for robust visual tracking , 2011, 2011 International Conference on Computer Vision.

[3]  Afshin Dehghan,et al.  GMCP-Tracker: Global Multi-object Tracking Using Generalized Minimum Clique Graphs , 2012, ECCV.

[4]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[5]  Martial Hebert,et al.  Volumetric Features for Video Event Detection , 2010, International Journal of Computer Vision.

[6]  Yang Wang,et al.  Recognizing human actions from still images with latent poses , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[7]  Luc Van Gool,et al.  Online Multiperson Tracking-by-Detection from a Single, Uncalibrated Camera , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Ramakant Nevatia,et al.  Online Learned Discriminative Part-Based Appearance Models for Multi-human Tracking , 2012, ECCV.

[9]  Ram Nevatia,et al.  Learning to associate: HybridBoosted multi-target tracker for crowded scene , 2009, CVPR.

[10]  Ramakant Nevatia,et al.  Robust Object Tracking by Hierarchical Association of Detection Responses , 2008, ECCV.

[11]  Ramakant Nevatia,et al.  Efficient Inference with Multiple Heterogeneous Part Detectors for Human Pose Estimation , 2010, ECCV.

[12]  Gérard G. Medioni,et al.  Context tracker: Exploring supporters and distracters in unconstrained environments , 2011, CVPR 2011.

[13]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Horst Bischof,et al.  Hough-based tracking of non-rigid objects , 2011, 2011 International Conference on Computer Vision.

[15]  Juergen Gall,et al.  Class-specific Hough forests for object detection , 2009, CVPR.

[16]  Jiri Matas,et al.  P-N learning: Bootstrapping binary classifiers by structural constraints , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[17]  Luc Van Gool,et al.  You'll never walk alone: Modeling social behavior for multi-target tracking , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[18]  Konrad Schindler,et al.  Multi-target tracking by continuous energy minimization , 2011, CVPR 2011.

[19]  Jordi Gonzàlez,et al.  A coarse-to-fine approach for fast deformable object detection , 2011, CVPR 2011.

[20]  Ben Taskar,et al.  Cascaded Models for Articulated Pose Estimation , 2010, ECCV.

[21]  Junseok Kwon,et al.  Tracking of a non-rigid object via patch-based dynamic appearance modeling and adaptive Basin Hopping Monte Carlo sampling , 2009, CVPR.

[22]  Ramakant Nevatia,et al.  Robust Object Tracking Using Constellation Model with Superpixel , 2012, ACCV.

[23]  Vincent Lepetit,et al.  Fast Keypoint Recognition Using Random Ferns , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  David A. Forsyth,et al.  Strike a pose: tracking people by finding stylized poses , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[25]  Shihong Lao,et al.  Multiple Player Tracking in Sports Video: A Dual-Mode Two-Way Bayesian Inference Approach With Progressive Observation Modeling , 2011, IEEE Transactions on Image Processing.

[26]  Yi Yang,et al.  Articulated pose estimation with flexible mixtures-of-parts , 2011, CVPR 2011.

[27]  Zhen Qin,et al.  Improving multi-target tracking via social grouping , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Ian D. Reid,et al.  Stable multi-target tracking in real-time surveillance video , 2011, CVPR 2011.

[29]  Ramakant Nevatia,et al.  High performance object detection by collaborative learning of Joint Ranking of Granules features , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[30]  Margrit Betke,et al.  Coupling detection and data association for multiple object tracking , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[31]  Afshin Dehghan,et al.  Part-based multiple-person tracking with partial occlusion handling , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[32]  Bi Song,et al.  A Stochastic Graph Evolution Framework for Robust Multi-target Tracking , 2010, ECCV.

[33]  Luc Van Gool,et al.  Coupled Detection and Trajectory Estimation for Multi-Object Tracking , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[34]  Ramakant Nevatia,et al.  How does person identity recognition help multi-person tracking? , 2011, CVPR 2011.

[35]  Ramakant Nevatia,et al.  Multi-target tracking by on-line learned discriminative appearance models , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.