Ambiguities in Visual Tracking of Articulated Objects Using Two- and Three-Dimensional Models

Three-dimensional (3D) kinematic models are widely-used in video-based figure tracking. We show that these models can suffer from singularities when motion is directed along the viewing axis of a single camera. The single camera case is important because it arises in many interesting applications, such as motion capture from movie footage, video surveillance, and vision-based user-interfaces. We describe a novel two-dimensional scaled prismatic model (SPM) for figure registration. In contrast to 3D kinematic models, the SPM has fewer singularity problems and does not require detailed knowledge of the 3D kinematics. We fully characterize the singularities in the SPM and demonstrate tracking through singularities using synthetic and real examples. We demonstrate the application of our model to motion capture from movies. Fred Astaire is tracked in a clip from the film “Shall We Dance”. We also present the use of monocular hand tracking in a 3D user-interface. These results demonstrate the benefits of the SPM in tracking with a single source of video.

[1]  Pietro Perona,et al.  Monocular tracking of the human arm in 3D , 1995, Proceedings of IEEE International Conference on Computer Vision.

[2]  Takeo Kanade,et al.  DigitEyes: vision-based hand tracking for human-computer interaction , 1994, Proceedings of 1994 IEEE Workshop on Motion of Non-rigid and Articulated Objects.

[3]  James M. Rehg,et al.  Analyzing articulated motion using expectation-maximization , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[4]  K. Rohr Towards model-based recognition of human movements in image sequences , 1994 .

[5]  James M. Rehg,et al.  Singularity analysis for articulated object tracking , 1998, Proceedings. 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No.98CB36231).

[6]  Cristian Sminchisescu,et al.  Covariance scaled sampling for monocular 3D body tracking , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[7]  Ioannis A. Kakadiaris,et al.  Model-based estimation of 3D human motion with occlusion based on active multi-viewpoint selection , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[8]  William T. Freeman,et al.  Bayesian Reconstruction of 3D Human Motion from Single-Camera Video , 1999, NIPS.

[9]  Masanobu Yamamoto,et al.  Human motion analysis based on a robot arm model , 1991, Proceedings. 1991 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[10]  Takeo Kanade,et al.  Visual Tracking of High DOF Articulated Structures: an Application to Human Hand Tracking , 1994, ECCV.

[11]  James M. Rehg,et al.  Reconstruction of 3D figure motion from 2D correspondences , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[12]  Michael Isard,et al.  Partitioned Sampling, Articulated Objects, and Interface-Quality Hand Tracking , 2000, ECCV.

[13]  J J Koenderink,et al.  Affine structure from motion. , 1991, Journal of the Optical Society of America. A, Optics and image science.

[14]  James M. Rehg,et al.  Video editing using figure tracking and image-based rendering , 2000, Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101).

[15]  Takeo Kanade,et al.  Shape and motion from image streams under orthography: a factorization method , 1992, International Journal of Computer Vision.

[16]  Pradeep K. Khosla,et al.  The resolvability ellipsoid for visual servoing , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Andrew Blake,et al.  A Probabilistic Exclusion Principle for Tracking Multiple Objects , 2000, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[18]  Andrew Blake,et al.  Probabilistic tracking in a metric space , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[19]  Michael Girard,et al.  Computer animation of knowledge-based human grasping , 1991, SIGGRAPH.

[20]  Alex Pentland,et al.  Recovery of Nonrigid Motion and Structure , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[21]  P. Anandan,et al.  Hierarchical Model-Based Motion Estimation , 1992, ECCV.

[22]  Takeo Kanade,et al.  Model-based tracking of self-occluding articulated objects , 1995, Proceedings of IEEE International Conference on Computer Vision.

[23]  Michal Irani,et al.  Computing occluding and transparent motions , 1994, International Journal of Computer Vision.

[24]  Kenichi Kanatani,et al.  Gauges and gauge transformations for uncertainty description of geometric structure with indeterminacy , 2001, IEEE Trans. Inf. Theory.

[25]  Andrew Blake,et al.  Articulated body motion capture by annealed particle filtering , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[26]  Mark W. Spong,et al.  Robot dynamics and control , 1989 .

[27]  David C. Hogg Model-based vision: a program to see a walking person , 1983, Image Vis. Comput..

[28]  John E. Dennis,et al.  Numerical methods for unconstrained optimization and nonlinear equations , 1983, Prentice Hall series in computational mathematics.

[29]  Hans-Hellmut Nagel,et al.  Tracking Persons in Monocular Image Sequences , 1999, Comput. Vis. Image Underst..

[30]  Lee E. Weiss,et al.  Dynamic sensor-based control of robots with visual feedback , 1987, IEEE Journal on Robotics and Automation.

[31]  Larry S. Davis,et al.  3-D model-based tracking of humans in action: a multi-view approach , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[32]  J. Michael McCarthy,et al.  Introduction to theoretical kinematics , 1990 .

[33]  James M. Rehg,et al.  Reconstruction of 3-D Figure Motion from 2-D Correspondences , 2001, CVPR 2001.

[34]  James M. Rehg Motion Capture from Movies , 2000 .

[35]  Takeo Kanade,et al.  DigitEyes: Vision-Based Human Hand Tracking , 1993 .

[36]  James M. Rehg,et al.  TM Singularities in Articulated Object Tracking with 2-D and 3-D Models , 1997 .

[37]  James M. Rehg Visual analysis of high DOF articulated objects with application to hand tracking , 1995 .

[38]  Takeo Kanade,et al.  Visual tracking of a moving target by a camera mounted on a robot: a combination of control and vision , 1993, IEEE Trans. Robotics Autom..

[39]  Vladimir Pavlovic,et al.  A dynamic Bayesian network approach to figure tracking using learned dynamic models , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[40]  Michael J. Black,et al.  Automatic Detection and Tracking of Human Motion with a View-Based Representation , 2002, ECCV.

[41]  Olivier D. Faugeras,et al.  3D Articulated Models and Multiview Tracking with Physical Forces , 2001, Comput. Vis. Image Underst..

[42]  Michael J. Black,et al.  Cardboard people: a parameterized model of articulated image motion , 1996, Proceedings of the Second International Conference on Automatic Face and Gesture Recognition.

[43]  Luc Robert Camera Calibration without Feature Extraction , 1996, Comput. Vis. Image Underst..

[44]  David J. Fleet,et al.  Stochastic Tracking of 3D Human Figures Using 2D Image Motion , 2000, ECCV.

[45]  Rajeev Sharma,et al.  Motion perceptibility and its application to active vision-based servo control , 1997, IEEE Trans. Robotics Autom..

[46]  Michael Isard,et al.  Contour Tracking by Stochastic Propagation of Conditional Density , 1996, ECCV.

[47]  Ingemar J. Cox,et al.  An Efficient Implementation of Reid's Multiple Hypothesis Tracking Algorithm and Its Evaluation for the Purpose of Visual Tracking , 1996, IEEE Trans. Pattern Anal. Mach. Intell..

[48]  Jitendra Malik,et al.  Tracking people with twists and exponential maps , 1998, Proceedings. 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No.98CB36231).

[49]  Patrick Rives,et al.  A new approach to visual servoing in robotics , 1992, IEEE Trans. Robotics Autom..

[50]  Ioannis A. Kakadiaris,et al.  3D human body model acquisition from multiple views , 1995, Proceedings of IEEE International Conference on Computer Vision.

[51]  David C. Hogg,et al.  Wormholes in shape space: tracking through discontinuous changes in shape , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[52]  Matthew Brand,et al.  Shadow puppetry , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[53]  Luc Robert,et al.  Camera calibration without feature extraction , 1994, Proceedings of 12th International Conference on Pattern Recognition.

[54]  J. O'Rourke,et al.  Model-based image analysis of human motion using constraint propagation , 1980, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[55]  D. Pai Singularity, uncertainty and compliance of robot manipulators , 1988 .

[56]  Yoshihiko Nakamura,et al.  Advanced robotics - redundancy and optimization , 1990 .

[57]  Michael J. Black,et al.  A framework for modeling the appearance of 3D articulated figures , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[58]  Patrick Rives,et al.  A new approach to visual servoing in robotics , 1992, IEEE Trans. Robotics Autom..

[59]  Yoshiaki Shirai,et al.  Hand gesture estimation and model refinement using monocular camera-ambiguity limitation by inequality constraints , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[60]  James M. Rehg,et al.  A multiple hypothesis approach to figure tracking , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[61]  Pradeep Kumar Khosla,et al.  Real-time control and identification of direct-drive manipulators (robotics) , 1986 .