Paper information - Breaking the Chain: Liberation from the Temporal Markov Assumption for Tracking Human Poses

We present an approach to multi-target tracking that has expressive potential beyond the capabilities of chain-shaped hidden Markov models, yet has significantly reduced complexity. Our framework, which we call tracking-by-selection, is similar to tracking-by-detection in that it separates the tasks of detection and tracking, but it shifts temporal reasoning from the tracking stage to the detection stage. The core feature of tracking-by-selection is that it reasons about path hypotheses that traverse the entire video instead of a chain of single-frame object hypotheses. A traditional chain-shaped tracking-by-detection model is only able to promote consistency between one frame and the next. In tracking-by-selection, path hypotheses exist across time, and encouraging long-term temporal consistency is as simple as rewarding path hypotheses with consistent image features. One additional advantage of tracking-by-selection is that it results in a dramatically simplified model that can be solved exactly. We adapt an existing tracking-by-detection model to the tracking-by-selection framework, and show improved performance on a challenging dataset.
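The idea of rewarding whole-video path hypotheses for consistent image features can be illustrated with a toy sketch. This is not the authors' model: the names `path_score`, `select_path`, and `consistency_weight`, the scalar appearance feature, and the variance-based penalty are all assumptions made for illustration. Each hypothesis is scored as a whole, so a path whose appearance drifts steadily is penalized for its drift over the entire video, not only for the change between adjacent frames.

```python
from statistics import pvariance


def path_score(path, consistency_weight=1.0):
    """Score one path hypothesis that spans the entire video.

    `path` is a list of (detection_score, appearance_feature) pairs,
    one pair per frame. Both fields are hypothetical scalars.
    """
    detection_total = sum(score for score, _ in path)
    features = [feature for _, feature in path]
    # Long-term consistency: penalize appearance variance over the whole
    # path, not just differences between consecutive frames.
    inconsistency = pvariance(features) if len(features) > 1 else 0.0
    return detection_total - consistency_weight * inconsistency


def select_path(hypotheses, consistency_weight=1.0):
    """Return the index of the highest-scoring path hypothesis."""
    return max(
        range(len(hypotheses)),
        key=lambda i: path_score(hypotheses[i], consistency_weight),
    )


# A steady path with modest detection scores and constant appearance.
steady = [(1.0, 0.5)] * 5
# A drifting path with stronger detections whose appearance changes
# by the same amount at every frame.
drifting = [(1.1, 0.0), (1.1, 1.0), (1.1, 2.0), (1.1, 3.0), (1.1, 4.0)]

best = select_path([steady, drifting])
```

With the consistency term enabled, the steady path is selected even though its per-frame detection scores are lower. Setting `consistency_weight=0.0` reduces the selection to detection scores alone, in which case the drifting path wins.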

Silvio Savarese | Wongun Choi | Ryan Tokola
