Selecting the best viewpoint for human-pose estimation

Estimating human poses is an important step toward robots that can understand human motion. Because the human body is highly articulated, changing the viewpoint of a robot-mounted sensor can improve the accuracy of human-pose estimation. We propose a two-phase approach that determines the best viewpoint of a depth sensor for human-pose estimation. The approach measures the quality of candidate viewpoints and selects one of them as the best viewpoint for each human pose. Based on viewpoint quality, a human pose can be mapped directly to the best viewpoint without reconstructing the human body; the approach thus provides a discriminative mapping for determining the best viewpoint when estimating different human poses. To measure the quality of a candidate viewpoint, the viewpoint is first instantiated by modeling its depth sensor with the finite projective camera model. Viewpoint quality is then expressed in terms of the error of the resulting human-pose estimate, and the mapping is derived by minimizing this error over the candidate viewpoints. The proposed two-phase approach was evaluated on a benchmark database. Experimental results show that the best viewpoint for a human pose can be determined by evaluating the quality of candidate viewpoints, and that using the viewpoint selected by the proposed approach reduces both the mean error and the standard deviation of human-pose estimates.
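The viewpoint-selection idea above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's method: the joint coordinates, the look-at camera construction, and the `ambiguity_score` proxy (counting joint pairs that overlap in the image while lying at different depths, i.e., likely self-occlusions) are all assumptions standing in for the paper's actual error measure. Only the overall structure matches the abstract: instantiate each candidate viewpoint with a finite projective (pinhole) camera, score it, and map the pose to the viewpoint minimizing the score.

```python
import math

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]]

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def camera_rotation(cam_pos, target=(0.0, 0.0, 0.0), up=(0.0, 1.0, 0.0)):
    """Look-at rotation: rows of R are the camera's right, up, and forward axes."""
    fwd = normalize(sub(target, cam_pos))
    right = normalize(cross(fwd, up))
    true_up = cross(right, fwd)
    return [right, true_up, fwd]

def project(joints, cam_pos, focal=500.0):
    """Project 3-D joints through a finite projective (pinhole) camera.

    Returns (u, v, depth) per joint; intrinsics are reduced to one focal length.
    """
    R = camera_rotation(cam_pos)
    pts = []
    for p in joints:
        pc = [sum(R[i][k] * (p[k] - cam_pos[k]) for k in range(3)) for i in range(3)]
        z = pc[2]
        pts.append((focal * pc[0] / z, focal * pc[1] / z, z))
    return pts

def ambiguity_score(joints, cam_pos, pix_thresh=40.0, depth_thresh=0.1):
    """Hypothetical stand-in for the viewpoint-quality (error) measure:
    count joint pairs that nearly coincide in the image but differ in depth."""
    pts = project(joints, cam_pos)
    score = 0
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            du, dv = pts[i][0] - pts[j][0], pts[i][1] - pts[j][1]
            if math.hypot(du, dv) < pix_thresh and abs(pts[i][2] - pts[j][2]) > depth_thresh:
                score += 1
    return score

def best_viewpoint(joints, viewpoints):
    """Map the pose to the candidate viewpoint with the lowest proxy error."""
    return min(range(len(viewpoints)), key=lambda k: ambiguity_score(joints, viewpoints[k]))

# Toy pose: a vertical "torso" with one hand extended forward (+z).
joints = [(0.0, 0.0, 0.0), (0.0, 0.3, 0.0), (0.0, 0.6, 0.0), (0.0, 0.3, 0.4)]
front, side = (0.0, 0.0, 2.0), (2.0, 0.0, 0.0)
# From the front, the hand occludes the shoulder; from the side it does not,
# so the side viewpoint is selected.
choice = best_viewpoint(joints, [front, side])
```

In this toy example the front camera sees the extended hand nearly on top of the shoulder at a different depth, so its score is worse and the side viewpoint is chosen; the real approach replaces this occlusion count with the measured error of the human-pose estimate.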
