论文信息 - Shadow puppetry

Shadow puppetry

The mapping between 3D body poses and 2D shadows is fundamentally many-to-many and defeats regression methods, even with windowed context. We show how to learn a function between paths in the two systems, resolving ambiguities by integrating information over the entire length of a sequence. The basis of this function is a configural and dynamical manifold that summarizes the target system's behaviour. This manifold can be modeled from data with a hidden Markov model having special topological properties that we obtain via entropy minimization. Inference is then a matter of solving for the geodesic on the manifold that best explains the evidence in the cue sequence. We give a closed-form maximum a posteriori solution for geodesics through the learned density space, thereby obtaining optimal paths over the dynamical manifold. These methods give a completely general way to perform inference over time-series; in vision they support analysis, recognition, classification and synthesis of behaviours in linear time. We demonstrate with a prototype that infers 3D from monocular monochromatic sequences (e.g., back-subtractions), without using any articulatory body model. The framework readily accommodates multiple cameras and other sources of evidence such as optical flow or feature tracking.

Matthew Brand | M. Brand

[1] L. Baum,et al. An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process , 1972 .

[2] Jr. G. Forney,et al. The viterbi algorithm , 1973 .

[3] Donald R. Smith. Variational methods in optimization , 1974 .

[4] David C. Hogg. Model-based vision: a program to see a walking person , 1983, Image Vis. Comput..

[5] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[6] Stephen M. Omohundro,et al. Nonlinear manifold learning for visual speech recognition , 1995, Proceedings of IEEE International Conference on Computer Vision.

[7] Pietro Perona,et al. Monocular tracking of the human arm in 3D , 1995, Proceedings of IEEE International Conference on Computer Vision.

[8] Ioannis A. Kakadiaris,et al. Model-based estimation of 3D human motion with occlusion based on active multi-viewpoint selection , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[9] L. Davis,et al. el-based tracking of humans in action: , 1996 .

[10] James M. Rehg,et al. Singularity analysis for articulated object tracking , 1998, Proceedings. 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No.98CB36231).

[11] W. Freeman,et al. Bayesian Estimation of 3-D Human Motion , 1998 .

[12] Jitendra Malik,et al. Tracking people with twists and exponential maps , 1998, Proceedings. 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No.98CB36231).

[13] Matthew Brand,et al. Voice puppetry , 1999, SIGGRAPH.

[14] Matthew Brand,et al. Pattern discovery via entropy minimization , 1999, AISTATS.

[15] Matthew Brand,et al. Structure Learning in Conditional Probability Models via an Entropic Prior and Parameter Extinction , 1999, Neural Computation.

[16] Ioannis A. Kakadiaris,et al. Model-Based Estimation of 3D Human Motion , 2000, IEEE Trans. Pattern Anal. Mach. Intell..