Evaluating Recognition-Based Motion Capture on HumanEva II Test Data

The advent of the HumanEva standardized motion capture data sets has enabled quantitative evaluation of motion capture algorithms on comparable terms. This paper measures the performance of an existing monocular recognition-based pose recovery algorithm on select HumanEva data, including all the HumanEva II clips. The method uses a physically motivated Markov process to connect adjacent frames and achieves a 3D relative mean error of 8.9 cm per joint, better than recently reported results. The paper further investigates the factors contributing to this error and finds that research into better pose retrieval methods offers promise for improving this technique and those related to it. Finally, it investigates the effect of local search optimization within the same recognition-based algorithm and finds no significant deterioration in the results, indicating that processing speed can be largely independent of the size of the recognition library for this approach.
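
As a rough illustration of the error measure quoted above, the following sketch computes a relative mean error per joint for a single frame. It assumes that "relative" means joint positions are expressed relative to a root joint before comparison, and that the mean is taken over all joints, root included; the function name, the `root_index` parameter, and the sample coordinates are hypothetical and not taken from the paper or the HumanEva tools.

```python
import math


def relative_mean_joint_error(predicted, ground_truth, root_index=0):
    """Mean Euclidean distance per joint after root alignment.

    Both arguments are sequences of (x, y, z) joint positions in the same
    units (e.g. cm). Subtracting the root joint from each pose removes any
    global translation, so only the relative pose is compared. This is an
    assumed definition of "relative" error, for illustration only.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("poses must be non-empty and have the same joint count")

    pred_root = predicted[root_index]
    true_root = ground_truth[root_index]

    total = 0.0
    for p, g in zip(predicted, ground_truth):
        # Express each joint relative to its own pose's root joint.
        p_rel = [a - b for a, b in zip(p, pred_root)]
        g_rel = [a - b for a, b in zip(g, true_root)]
        total += math.dist(p_rel, g_rel)
    return total / len(predicted)


# Hypothetical two-joint poses: the estimate is globally shifted from the
# ground truth, but only the second joint differs in relative position.
estimate = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
truth = [(10.0, 10.0, 10.0), (10.0, 10.0, 10.0)]
frame_error = relative_mean_joint_error(estimate, truth)
```

A clip-level figure such as the one reported would then be obtained by averaging this per-frame value over all frames of a sequence, again under the stated assumptions.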
