Hand tracking based on the combination of 2D and 3D model in gaze-directed video

This paper investigates model-based hand tracking in gaze-directed video of everyday human manipulation activities in a kitchen environment. The video is recorded by a gaze-directed camera, which actively points at the area of visual attention of the person wearing it. We present a method that combines 2D and 3D hand models to estimate the position of the hand in the image accurately and the pose of the hand in 3D roughly. The method uses the 2D model tracking result to initialize and predict the 3D tracking, which reduces the number of particles required and makes local configuration adaptation possible. To evaluate the method, we run the algorithm on several video sequences from both a normal camera and a gaze-directed camera. The error ratio of the distance between the ground truth and the tracking result serves as an objective measure. The hand-movement trajectories and the projected model in every frame show that the method is effective and provides a good foundation for future recognition and analysis.
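The idea of letting a 2D tracking result initialize the 3D particle set can be illustrated with a minimal sketch. It is not the authors' implementation: the pinhole camera parameters, the depth range, the noise levels, and all function names are hypothetical, and the 3D state is reduced to a hand position, whereas the method described here also estimates a rough 3D pose. Particles are seeded along the viewing ray through the 2D estimate and then weighted by how well they reproject onto it, so far fewer particles are needed than when sampling the whole 3D volume.

```python
import math
import random

# Hypothetical pinhole camera parameters (assumptions, not from the paper).
FOCAL = 500.0
CX, CY = 320.0, 240.0


def project(x, y, z):
    """Project a 3D point in camera coordinates to pixel coordinates."""
    return FOCAL * x / z + CX, FOCAL * y / z + CY


def init_particles(u, v, n, rng, z_range=(0.3, 1.0), pixel_sigma=3.0):
    """Seed 3D particles along the viewing ray through the 2D estimate (u, v)."""
    particles = []
    for _ in range(n):
        z = rng.uniform(*z_range)          # depth is unknown, so sample it
        du = rng.gauss(0.0, pixel_sigma)   # small jitter around the 2D result
        dv = rng.gauss(0.0, pixel_sigma)
        x = (u + du - CX) * z / FOCAL      # back-project to camera coordinates
        y = (v + dv - CY) * z / FOCAL
        particles.append((x, y, z))
    return particles


def weight(particles, u, v, sigma=5.0):
    """Normalized Gaussian likelihood on reprojection distance to (u, v)."""
    ws = []
    for x, y, z in particles:
        pu, pv = project(x, y, z)
        d2 = (pu - u) ** 2 + (pv - v) ** 2
        ws.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    total = sum(ws)
    if total == 0.0:                       # all weights underflowed: fall back to uniform
        return [1.0 / len(ws)] * len(ws)
    return [w / total for w in ws]


def estimate(particles, weights):
    """Weighted mean of the particle set as the 3D position estimate."""
    return tuple(
        sum(w * p[i] for p, w in zip(particles, weights)) for i in range(3)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    particles = init_particles(400.0, 260.0, 200, rng)
    weights = weight(particles, 400.0, 260.0)
    print(estimate(particles, weights))
```

A full tracker would follow this step with resampling and a motion model, as in CONDENSATION-style particle filtering, and would replace the reprojection likelihood with an image-based one.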
