Hand gesture estimation and model refinement using monocular camera: ambiguity limitation by inequality constraints

The paper proposes a method that precisely estimates the pose (joint angles) of a moving human hand and also refines the 3D shape (widths and lengths) of a given hand model from a monocular image sequence that contains no depth data. First, given an initial, roughly shaped 3D model, pose candidates are generated in a search space that is efficiently reduced using silhouette features and motion prediction. The candidates with high posterior probabilities are then selected, which yields rough poses and resolves the feature correspondence even under quick motion and self-occlusion. Next, to refine both the 3D shape model and the rough pose despite the depth ambiguity of monocular images, the paper proposes an ambiguity limitation method that expresses loose prior knowledge about the object as inequality constraints. The method calculates the probability distribution that satisfies both the observations and the constraints. When multiple solutions are possible, all of them are preserved until a unique solution can be determined. Experimental results show that the depth ambiguity is reduced incrementally as informative observations are obtained.
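The idea of limiting depth ambiguity with inequality constraints can be illustrated with a toy problem. The sketch below is not the paper's algorithm. It assumes a single finger segment of unknown length seen under orthographic projection, so each frame observes only the projected length `length * cos(theta)`, where the out-of-plane tilt `theta` is unobserved. The loose knowledge is written as two inequalities: `3.0 <= length <= 6.0` and `0 <= theta <= theta_max`. All names, numeric values, the Gaussian noise model, and the grid-based belief are assumptions made for illustration.

```python
import math


def update_length_belief(belief, lengths, p_obs, sigma=0.05,
                         theta_max=math.pi / 3, n_theta=60):
    """One Bayesian update of a discrete belief over segment length.

    The tilt angle is unobserved, so it is marginalized over the interval
    allowed by the inequality constraint 0 <= theta <= theta_max.
    """
    thetas = [theta_max * k / (n_theta - 1) for k in range(n_theta)]
    posterior = []
    for prior, length in zip(belief, lengths):
        # Average Gaussian likelihood of the observed projected length
        # over all tilt angles that satisfy the constraint.
        lik = sum(
            math.exp(-0.5 * ((p_obs - length * math.cos(t)) / sigma) ** 2)
            for t in thetas
        ) / n_theta
        posterior.append(prior * lik)
    total = sum(posterior)
    if total == 0.0:
        raise ValueError("observation is inconsistent with the constraints")
    return [w / total for w in posterior]


def spread(belief, lengths):
    """Standard deviation of the belief, used as a measure of ambiguity."""
    mean = sum(w * x for w, x in zip(belief, lengths))
    return math.sqrt(sum(w * (x - mean) ** 2 for w, x in zip(belief, lengths)))


def demo():
    # Inequality constraint on shape: 3.0 <= length <= 6.0 (hypothetical units).
    lengths = [3.0 + 0.05 * i for i in range(61)]
    belief = [1.0 / len(lengths)] * len(lengths)  # uniform inside the bounds
    true_length = 4.5
    spreads = [spread(belief, lengths)]
    # Three frames with different (hidden) tilts; the last one is nearly
    # parallel to the image plane and is therefore the most informative.
    for hidden_tilt in (1.0, 0.6, 0.1):
        p_obs = true_length * math.cos(hidden_tilt)
        belief = update_length_belief(belief, lengths, p_obs)
        spreads.append(spread(belief, lengths))
    return lengths, belief, spreads


if __name__ == "__main__":
    _, _, spreads = demo()
    print([round(s, 3) for s in spreads])
```

In this sketch every length that remains consistent with both the observations and the constraints keeps nonzero probability, so competing solutions are kept rather than discarded early. A single frame leaves a wide range of lengths plausible, and the range narrows as frames with more informative tilts arrive.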
