Real-time pose estimation of articulated objects using low-level motion

We present a method for tracking and estimating the pose of articulated objects in real time. A bottom-up approach detects instances of the object in each frame, and these detections are then linked together using a high-level a priori motion model. Unlike approaches that rely on appearance, our method depends entirely on motion: initial low-level part detection is based on how a region moves rather than how it looks. The work is best described as pictorial structures using motion. The observational data is a sparse cloud of points extracted with a standard feature tracker; the noise in this data is not Gaussian but systematic, arising from tracking errors. Using a probabilistic framework, we are able to overcome both corrupt and missing data whilst still inferring new poses from a generative model. Our approach requires no manual initialisation, and we show results for a number of complex scenes and different classes of articulated object, demonstrating both the robustness and versatility of the presented technique.
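
To make the idea of motion-based low-level evidence concrete, the following is a minimal sketch, not the method of this paper. It assumes each tracked point is stored as a mapping from frame index to an (x, y) position, with frames simply absent where the tracker lost the point. The function name `region_motion` and this data layout are hypothetical, and a median is used here as a simple stand-in for the probabilistic treatment of corrupt and missing data described above.

```python
from statistics import median


def region_motion(tracks, frame):
    """Estimate how a region moves between `frame` and `frame + 1`.

    `tracks` is a list of dicts mapping frame index -> (x, y).
    Points missing in either frame are skipped (missing data), and the
    median limits the influence of gross tracking errors (corrupt data).
    Returns (dx, dy), or None if no point is observed in both frames.
    """
    dxs, dys = [], []
    for track in tracks:
        start, end = track.get(frame), track.get(frame + 1)
        if start is None or end is None:
            continue  # point was not tracked across this frame pair
        dxs.append(end[0] - start[0])
        dys.append(end[1] - start[1])
    if not dxs:
        return None
    return (median(dxs), median(dys))


# Hypothetical tracks for one region: two consistent points,
# one drifting outlier, and one point lost after frame 0.
tracks = [
    {0: (0, 0), 1: (2, 1)},
    {0: (5, 5), 1: (7, 6)},
    {0: (1, 1), 1: (30, -20)},
    {0: (3, 3)},
]
motion = region_motion(tracks, 0)
```

In this toy case the estimate follows the two consistent points rather than the drifting one, which illustrates why systematic, non-Gaussian tracking errors call for something more robust than a plain average.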
