Tracking Human Pose by Tracking Symmetric Parts

The human body is structurally symmetric. Tracking-by-detection approaches for human pose suffer from double counting, where the same image evidence is used to explain two separate but symmetric parts, such as the left and right feet. Double counting, if left unaddressed, can critically affect subsequent processes such as action recognition, affordance estimation, and pose reconstruction. In this work, we present an occlusion-aware algorithm for tracking human pose in an image sequence that addresses the problem of double counting. Our key insight is that tracking human pose can be cast as a multi-target tracking problem where the "targets" are related by an underlying articulated structure. The human body is modeled as a combination of singleton parts (such as the head and neck) and symmetric pairs of parts (such as the shoulders, knees, and feet). Symmetric body parts are jointly tracked with mutual exclusion constraints to prevent double counting by reasoning about occlusion. We evaluate our algorithm on an outdoor dataset with natural background clutter and on a standard indoor dataset (HumanEva-I), and compare it against a state-of-the-art pose estimation algorithm.
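
The idea of jointly tracking a symmetric pair under a mutual exclusion constraint can be illustrated with a small sketch. This is not the algorithm from the paper, whose formulation is not given in the abstract. It is a plain Viterbi search over joint (left, right) assignments, and every name, cost, and threshold in it (`track_pair`, `min_sep`, `occ_cost`, `motion_weight`, the `(x, y, score)` candidate format) is an assumption made for illustration. Each part either takes one detection candidate or is marked occluded, and two visible parts may never claim the same or nearly the same candidate.

```python
import itertools
import math

OCCLUDED = None  # hypothetical label: the part is not visible in this frame


def joint_states(cands, min_sep):
    """List (left, right) assignments that respect mutual exclusion."""
    options = list(range(len(cands))) + [OCCLUDED]
    states = []
    for a, b in itertools.product(options, repeat=2):
        if a is not OCCLUDED and b is not OCCLUDED:
            # Two visible parts may not explain the same image evidence.
            if a == b or math.dist(cands[a][:2], cands[b][:2]) < min_sep:
                continue
        states.append((a, b))
    return states


def part_cost(cands, idx, occ_cost):
    """Cost of one part: a fixed price if occluded, else 1 - detection score."""
    return occ_cost if idx is OCCLUDED else 1.0 - cands[idx][2]


def move_cost(prev_cands, p, cands, q, motion_weight):
    """Penalize displacement between frames; free across an occlusion."""
    if p is OCCLUDED or q is OCCLUDED:
        return 0.0
    return motion_weight * math.dist(prev_cands[p][:2], cands[q][:2])


def track_pair(frames, min_sep=5.0, occ_cost=0.6, motion_weight=0.05):
    """Viterbi over joint states; returns one (left, right) pair per frame."""
    states = [joint_states(c, min_sep) for c in frames]

    def node_cost(t, s):
        return (part_cost(frames[t], s[0], occ_cost)
                + part_cost(frames[t], s[1], occ_cost))

    best = {s: node_cost(0, s) for s in states[0]}
    back = []
    for t in range(1, len(frames)):
        new_best, pointers = {}, {}
        for s in states[t]:
            prev, cost = min(
                ((p, best[p]
                  + move_cost(frames[t - 1], p[0], frames[t], s[0], motion_weight)
                  + move_cost(frames[t - 1], p[1], frames[t], s[1], motion_weight))
                 for p in states[t - 1]),
                key=lambda pc: pc[1],
            )
            new_best[s] = cost + node_cost(t, s)
            pointers[s] = prev
        best = new_best
        back.append(pointers)

    # Backtrack from the cheapest final state.
    state = min(best, key=best.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Made-up foot candidates as (x, y, score); the second foot is weak in frame 1.
frames = [
    [(10.0, 50.0, 0.9), (30.0, 50.0, 0.8)],
    [(11.0, 50.0, 0.9), (29.0, 50.0, 0.3)],
]
path = track_pair(frames)
```

If each foot were tracked independently, both would pick the strongest candidate in every frame, which is the double counting described above. In the joint search that assignment is not a valid state, so the weaker foot must either take a different candidate or be declared occluded. The sketch has no prior that distinguishes left from right, so its output is only defined up to a swap of the two labels.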
