3D Pose Tracking Using a Recovered 3D Model

Template warping is a widely used approach to pose recovery, and its flexibility makes it especially useful for tracking motion in monocular video. Tracking accuracy in template warping is largely determined by the appearance of the 3D model. That appearance is commonly initialized by the user from the first frame of the video footage, which yields a hand-estimated 3D model; because such models are prone to inaccuracies, accurate tracking results are difficult to obtain. To overcome this problem, we propose an iterative optimization process that estimates the appearance of the 3D model during tracking. Our results show that the proposed 3D model estimation performs better than initialization with a hand-estimated model.

[1]  Marco La Cascia,et al.  Fast, Reliable Head Tracking under Varying Illumination: An Approach Based on Registration of Texture-Mapped 3D Models , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[2]  Toby Sharp,et al.  Real-time human pose recognition in parts from single depth images , 2011, CVPR.

[3]  Antonio Torralba,et al.  SIFT Flow: Dense Correspondence across Scenes and Its Applications , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Bodo Rosenhahn,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence Combined Region-and Motion-based 3d Tracking of Rigid and Articulated Objects , 2022 .

[5]  Patrick Bouthemy,et al.  A 2D-3D model-based approach to real-time visual tracking , 2001, Image Vis. Comput..

[6]  Javier R. Movellan,et al.  Generalized adaptive view-based appearance model: Integrated framework for monocular head pose estimation , 2008, 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition.

[7]  R. J. Althof,et al.  A rapid and automatic image registration algorithm with subpixel accuracy , 1997, IEEE Transactions on Medical Imaging.

[8]  Bodo Rosenhahn,et al.  Three-Dimensional Shape Knowledge for Joint Image Segmentation and Pose Tracking , 2007, International Journal of Computer Vision.

[9]  Jitendra Malik,et al.  Twist Based Acquisition and Tracking of Animal and Human Kinematics , 2004, International Journal of Computer Vision.

[10]  Ben Taskar,et al.  Parsing human motion with stretchable models , 2011, CVPR 2011.

[11]  Jing Xiao,et al.  Robust full‐motion recovery of head by dynamic templates and re‐registration techniques , 2003 .

[12]  Rui Sun,et al.  Accurate human face pose recovery from single image through generic shape regularization , 2015, Signal Process..

[13]  David J. Fleet,et al.  Model-Based 3D Hand Pose Estimation from Monocular Video , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Beiji Zou,et al.  Automatic reconstruction of 3D human motion pose from uncalibrated monocular video sequences based on markerless human motion tracking , 2009, Pattern Recognition.

[15]  Liang Wu,et al.  Recovering upper-body motion using a reinitialization particle filter , 2013, J. Electronic Imaging.

[16]  Bernt Schiele,et al.  Monocular 3D pose estimation and tracking by detection , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.