Unsupervised Learning of Human Motion

An unsupervised learning algorithm that can obtain a probabilistic model of an object composed of a collection of parts (a moving human body in our examples) automatically from unlabeled training data is presented. The training data include both useful "foreground" features as well as features that arise from irrelevant background clutter - the correspondence between parts and detected features is unknown. The joint probability density function of the parts is represented by a mixture of decomposable triangulated graphs which allow for fast detection. To learn the model structure as well as model parameters, an EM-like algorithm is developed where the labeling of the data (part assignments) is treated as hidden variables. The unsupervised learning technique is not limited to decomposable triangulated graphs. The efficiency and effectiveness of our algorithm is demonstrated by applying it to generate models of human motion automatically from unlabeled image sequences, and testing the learned models on a variety of sequences.

[1]  C. N. Liu,et al.  Approximating discrete probability distributions with dependence trees , 1968, IEEE Trans. Inf. Theory.

[2]  Umberto Bertelè,et al.  Nonserial Dynamic Programming , 1972 .

[3]  G. Johansson Visual perception of biological motion and a model for its analysis , 1973 .

[4]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[5]  Ronald L. Rivest,et al.  Introduction to Algorithms , 1990 .

[6]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[7]  C. Tomasi Detection and Tracking of Point Features , 1991 .

[8]  Randal C. Nelson,et al.  Detecting activities , 1993, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Karl Rohr,et al.  Incremental recognition of pedestrians from image sequences , 1993, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Heekuck Oh,et al.  Neural Networks for Pattern Recognition , 1993, Adv. Comput..

[11]  Michael Isard,et al.  3D position, attitude and shape input using video tracking of hands and lips , 1994, SIGGRAPH.

[12]  Takeo Kanade,et al.  Visual Tracking of High DOF Articulated Structures: an Application to Human Hand Tracking , 1994, ECCV.

[13]  Pietro Perona,et al.  Monocular tracking of the human arm in 3D , 1995, Proceedings of IEEE International Conference on Computer Vision.

[14]  Yali Amit,et al.  Graphical Templates for Model Registration , 1996, IEEE Trans. Pattern Anal. Mach. Intell..

[15]  Larry S. Davis,et al.  W/sup 4/: Who? When? Where? What? A real time system for detecting and tracking people , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[16]  Jitendra Malik,et al.  Tracking people with twists and exponential maps , 1998, Proceedings. 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No.98CB36231).

[17]  Hans-Hellmut Nagel,et al.  Tracking Persons in Monocular Image Sequences , 1999, Comput. Vis. Image Underst..

[18]  Dariu Gavrila,et al.  The Visual Analysis of Human Movement: A Survey , 1999, Comput. Vis. Image Underst..

[19]  Michael J. Black,et al.  Parameterized Modeling and Recognition of Activities , 1999, Comput. Vis. Image Underst..

[20]  Pietro Perona,et al.  Unsupervised learning of models for object recognition , 2000 .

[21]  Pietro Perona,et al.  Unsupervised Learning of Models for Recognition , 2000, ECCV.

[22]  Michael I. Jordan,et al.  Learning with Mixtures of Trees , 2001, J. Mach. Learn. Res..

[23]  Nir Friedman,et al.  Being Bayesian about Network Structure , 2000, UAI.

[24]  Yang Song,et al.  Towards detection of human motion , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[25]  Nir Friedman,et al.  Learning Bayesian Networks from Data , 2001 .

[26]  David A. Forsyth,et al.  Human Tracking with Mixtures of Trees , 2001, ICCV.

[27]  Nathan Srebro,et al.  Maximum likelihood bounded tree-width Markov networks , 2001, Artif. Intell..

[28]  Yang Song,et al.  Monocular Perception of Biological Motion in Johansson Displays , 2001, Comput. Vis. Image Underst..

[29]  P. Perona,et al.  A probabilistic approach to human motion detection and labeling , 2003 .