Body Pose Tracking From Uncalibrated Camera Using Supervised Manifold Learning

We present a framework for estimating 3D body configuration and viewpoint from a single uncalibrated camera. We model the shape deformations that correspond to both viewpoint and body configuration changes throughout the motion. The observed shapes form a product space (different configurations × different views) and therefore lie on a two-dimensional manifold in the visual input space. The approach we introduce here is based on learning the visual observation manifold in a supervised manner. Instead of learning an embedding of the manifold, we learn the geometric deformation between an ideal manifold (a topologically equivalent conceptual structure) and a twisted version of the manifold (the data). For walking motion, we use a torus manifold to represent the data. In our experiments, we learned the torus manifold from synthetic data and estimated view and body configuration for circular walking sequences in the HUMANEVA-I data set. The 3D body pose was inferred from the estimated body configuration parameter using only one training cycle of 3D body poses per subject. Experimental results show accurate estimation of 3D body pose and view from a single camera.
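
To make the torus representation concrete, the sketch below parameterizes the (view, body configuration) product space as points on a torus embedded in R^3 and fits a supervised radial-basis-function mapping from torus coordinates to observed shape descriptors, then recovers view and gait phase for a new observation by searching over the torus. This is a minimal illustration of the general idea, not the paper's exact formulation: the torus radii, kernel width, descriptor dimensionality, grid resolution, and the brute-force search are all illustrative assumptions.

```python
import numpy as np

def torus_point(view_angle, gait_phase, R=2.0, r=1.0):
    """Embed a (view, body-configuration) pair as a point on a torus in R^3.

    view_angle and gait_phase are each in [0, 2*pi); the major and minor
    radii R and r are illustrative choices, not values from the paper.
    """
    x = (R + r * np.cos(gait_phase)) * np.cos(view_angle)
    y = (R + r * np.cos(gait_phase)) * np.sin(view_angle)
    z = r * np.sin(gait_phase)
    return np.array([x, y, z])

def fit_rbf_mapping(torus_coords, descriptors, sigma=0.5):
    """Fit a radial-basis-function regression from torus points to observed
    shape descriptors. Supervised: both sides are known for the synthetic
    training data, so no unsupervised embedding step is needed."""
    d = np.linalg.norm(torus_coords[:, None, :] - torus_coords[None, :, :], axis=-1)
    Phi = np.exp(-d**2 / (2 * sigma**2))            # Gaussian kernel matrix
    W, *_ = np.linalg.lstsq(Phi, descriptors, rcond=None)
    return W

def predict_descriptor(x, torus_coords, W, sigma=0.5):
    """Map a torus point into the visual-input (descriptor) space."""
    phi = np.exp(-np.linalg.norm(torus_coords - x, axis=-1)**2 / (2 * sigma**2))
    return phi @ W

def estimate_view_and_phase(obs, torus_coords, W, views, phases):
    """Brute-force search over the torus for the (view, phase) pair whose
    predicted descriptor best matches the observed one."""
    best, best_err = None, np.inf
    for v in views:
        for p in phases:
            pred = predict_descriptor(torus_point(v, p), torus_coords, W)
            err = np.linalg.norm(pred - obs)
            if err < best_err:
                best, best_err = (v, p), err
    return best

# Training grid over the (view, phase) product space.
views = np.linspace(0.0, 2 * np.pi, 24, endpoint=False)
phases = np.linspace(0.0, 2 * np.pi, 24, endpoint=False)
grid = np.array([torus_point(v, p) for v in views for p in phases])

# `descriptors` stands in for shape descriptors rendered from synthetic
# silhouettes; random values are used here purely for illustration.
descriptors = np.random.default_rng(0).normal(size=(grid.shape[0], 64))
W = fit_rbf_mapping(grid, descriptors)

# Recover view and body configuration (gait phase) for one observation.
view, phase = estimate_view_and_phase(descriptors[0], grid, W, views, phases)
```

Under this reading, the final step of the pipeline would index the subject's single training cycle of 3D poses at the estimated gait phase to produce the output 3D body pose.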